Monday, January 25, 2016

Three pieces to look at

I am waiting for links to David Poeppel’s three lectures and when I get them I will put some stuff up discussing them. As a preview: THEY WERE GREAT!!! However, technical issues stand in the way of making them available right now, so to give you something to do while you wait I have three pieces that you might want to peek at.

The first is a short article by Stephen Anderson (SA) (here). It’s on “language” behavior in non-humans. Much of it reviews the standard reasons for not assimilating what we do with what other “communicative” animals do. Many things communicate (indeed, perhaps everything does, as SA states in the very first sentence) but only we do so using a system of semantically arbitrary structured symbols (roughly, words) that it combines to generate a discrete infinity of meanings (roughly, syntax). SA calls this, following Hockett, the “Duality of Patterning” (5):

This refers to the fact that human languages are built on two essentially independent combinatory systems: phonology, and syntax. On the one hand, phonology describes the ways in which individually meaningless sounds are combined into meaningful units — words. And on the other, the quite distinct system of syntax specifies the ways in which words are combined to form phrases, clauses, and sentences.

Given Chomsky’s 60-year insistence on hierarchical recursion and discrete infinity as the central characteristics of human linguistic capacity, the syntax side of this uniqueness is (or should be) well known. SA usefully highlights the importance of combinatoric phonology, something that Minimalists, with their focus on the syntax-to-CI mapping, may be tempted to slight. Chomsky, interestingly, has focused quite a lot on the mystery behind words, but he too has been impressed with their open-textured “semantics” rather than their systematic AP combinatorics.[1] However, as SA notes, the latter is really quite important.

It is tempting to see the presence of phonology as simply an ornament, an inessential elaboration of the way basic meaningful units are formed. This would be a mistake, however: it is phonology that makes it possible for speakers of a language to expand its vocabulary at will and without effective limit. If every new word had to be constructed in such a way as to make it holistically distinct from all others, our capacity to remember, deploy and recognize an inventory of such signs would be severely limited, to something like a few hundred. As it is, however, a new word is constructed as simply a new combination of the inventory of familiar basic sound types, built up according to the regularities of the language’s phonology. This is what enables us to extend the language’s lexicon as new concepts and conditions require. (5)

So our linguistic atoms are peculiar not only semantically but phonetically as well. This is worth keeping in mind in Evolang speculations.
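For a rough sense of the arithmetic (my numbers, not SA’s): an inventory of about 30 contrastive sounds combined into forms of up to five segments already yields on the order of 30^5, roughly 24 million, possible word shapes, as against the few hundred holistically distinct signs SA thinks we could otherwise keep apart.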

So, SA reviews some of the basic ways that we differ from them when we communicate. The paper also ends with a critique of the tendency to semanticize (romanticize the semantics of) animal vocalizations. SA argues that this is a big mistake and that there is really no reason to think that animal calls have any interesting semantic features, at least if we mean by this that they are “proto” words. I agree with SA here. However, whether I do or not, if SA is correct, then it is important, for there is a strong temptation (and tendency) to latch onto things like monkey calls as the first steps towards “language.” In other words, it is the first refuge of those enthralled by the “continuity” thesis (see here). It is thus nice to have a considered takedown of the first part of this slippery slope.

There’s more in this nice compact little paper. It would even make a nice piece for a course that touches on these topics. So take a look.

The second paper is on theory refutation in science (here). It addresses the question of how ideas that we take to be wrong are scientifically weeded out. The standard account is that experiments are the disposal mechanism. This essay, based on the longer book that the author, Thomas Levenson, has written (see here), argues that this is a bad oversimplification. The book is a great read, but the main point is well expressed here. It explains how long it took to lose the idea that Vulcan (you know, Mr Spock’s birthplace) exists. Apparently, it took Einstein to kill the idea. Why did it take so long? Because the existence of Vulcan was a good idea that fit well with Newton’s ideas and that experiment had a hard time disproving. Why? Because small modifications of good theories are almost always able to meet experimental challenges, and when there is nothing better on offer, such small modifications of the familiar are reasonable alternatives to dumping successful accounts. So, naive falsificationism (the favorite methodological stance of the hard-headed, no-nonsense scientist) fails to describe actual practice, at least in serious areas of inquiry.

The last paper is by David Deutsch (here). The piece is a critical assessment of “artificial general intelligence” (AGI). The argument is that we are very far from understanding how thought works and that the contrary optimism that we hear from the CS community (the current leaders being the Bayesians) is based on an inductivist fallacy. Here’s the main critical point:

[I]t is simply not true that knowledge comes from extrapolating repeated observations. Nor is it true that ‘the future is like the past’, in any sense that one could detect in advance without already knowing the explanation. The future is actually unlike the past in most ways. Of course, given the explanation, changes in the earlier pattern of 19s are straightforwardly understood as being due to an invariant underlying pattern or law. But the explanation always comes first. Without that, any continuation of any sequence constitutes ‘the same thing happening again’ under some explanation.

Note that the last sentence is the old observation about the vacuity of citing “similarity” as an inductive mechanism. Any two things are similar in some way. And that is the problem. That this has been repeatedly noted seems to have had little effect. Again and again the idea that induction based on similarity is the engine that gets us to the generalizations we want keeps cropping up. Deutsch notes that this is still true of our most modern thinkers on the topic.

Currently one of the most influential versions of the ‘induction’ approach to AGI (and to the philosophy of science) is Bayesianism, unfairly named after the 18th-century mathematician Thomas Bayes, who was quite innocent of the mistake. The doctrine assumes that minds work by assigning probabilities to their ideas and modifying those probabilities in the light of experience as a way of choosing how to act. … As I argued above, that behaviourist, input-output model is appropriate for most computer programming other than AGI, but hopeless for AGI. It is ironic that mainstream psychology has largely renounced behaviourism, which has been recognised as both inadequate and inhuman, while computer science, thanks to philosophical misconceptions such as inductivism, still intends to manufacture human-type cognition on essentially behaviourist lines.
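To spell out the picture Deutsch is objecting to, here is a minimal sketch of the “assign probabilities to ideas and update them on experience” model he describes. The hypotheses, numbers, and code are mine, purely for illustration; nothing like this appears in his piece.

# A toy Bayesian "mind": two made-up hypotheses about a coin, reweighted by observations.
def bayes_update(beliefs, likelihoods, observation):
    # beliefs: P(h) for each hypothesis; likelihoods: P(observation | h)
    unnormalized = {h: beliefs[h] * likelihoods[h][observation] for h in beliefs}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

beliefs = {"fair coin": 0.5, "biased coin": 0.5}
likelihoods = {"fair coin": {"heads": 0.5, "tails": 0.5},
               "biased coin": {"heads": 0.9, "tails": 0.1}}

for obs in ["heads"] * 5:
    beliefs = bayes_update(beliefs, likelihoods, obs)
print(beliefs)  # after five heads, "biased coin" carries most of the probability (about 0.95)

As I read Deutsch, his complaint is not that this arithmetic is wrong but that all the interesting work lies in where the hypotheses and likelihoods come from in the first place, which is exactly the part such a model takes for granted: the explanation comes first.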

The only thing that Deutsch gets wrong in the quoted passage is the idea that mainstream psych has gotten rid of its inductive bias. If only!

The piece is a challenge. I am not really fond of the way it is written. However, the basic point it makes is on the mark. There are serious limits to inductivism, and the assumption that we are on the cusp of “solving” the problem deserves serious criticism.

So three easy pieces to keep you busy. Have fun.



[1] I put ‘semantics’ in scare quotes because Chomsky does not think much of the idea that meaning has much to do with reference. See here and here for some discussion.

8 comments:

  1. I largely agree with the paragraphs you selected from the Deutsch piece, but the overall argument seems like one that shouldn’t be endorsed. If we assume for the sake of argument that AGI is possible and we just haven’t figured out how to do it yet, then it seems far more likely (to me, FWIW) that we’ll get there using induction plus the right selection of built-in biases than using whatever exactly Popperian epistemology turns out to be. There’s a certain irony in Deutsch’s invocation of the universality of computation. Pretty much any process by which you form beliefs on the basis of data can be understood in broadly Bayesian terms. Is the suggestion then that AI researchers should be looking for algorithms that have no Bayesian interpretation but do have a Popperian one? I can’t think what that would mean or how to go about doing it. If Deutsch has any ideas about it he appears to be keeping them to himself.

    1. I agree. The demurral at the end was intended to point to this. Where I think he is right is where he echoes Goodman and others who have noted that induction means nothing until one specifies a similarity metric and that for most things of interest we have no idea what this is. The Popperian stuff is, well, IMO, not worth much. After saying we conjecture and refute the details become thin. I think for D these words stand in for a mystery, roughly what he says at the end. I think here he is right. What he points to in the CS and Bayes community is the hubris. It's all just learning from the past and projecting to the future yada yada yada. No it isn't. What is it? Right now we deeply don't know.

  2. Sandler et al 2011 (NLLT) and Sandler 2014 (here: http://septentrio.uit.no/index.php/nordlyd/article/view/2950) argue that Al-Sayyid Bedouin sign language, a newly emergent sign language, *is* a language (it has a lexicon and syntax) but *doesn't* have a combinatoric phonological system. They argue that it has a signature phonetics, though, and that a phonological system is in the process of emerging. Their claim might be generally consistent with what Anderson says about being restricted in terms of vocabulary size, and it might be easier for a sign language to develop a large-ish nonphonologically organized vocabulary than for a spoken language, but it seems pretty relevant.

    1. Agree on the relevance. Seems that the two systems are independent (not surprising) and that one doesn't "need" a large vocab to get a syntax going (we also knew this given young kids have a syntax with a pretty small vocab). It also looks like once you have a syntax then a phonology arises, and this is interesting. Wonder what drives this or if it is just a coincidence that this is the direction. Could one have started with a vocab+phono and then gotten a syntax? Or is this the necessary direction? Interesting. Thoughts?

    2. I don't know that we can say phonology only arises once there is a syntax, at least not in the acquisition of a language that already has phonology.

      But a possible reason why syntax does grease the skids of phonology is that very few (if any) of the things we can pronounce, in the adult language, are syntactically simplex. It doesn't much matter whether you are a Distributed Morphologist (bracketing paradox!), or a Spanning enthusiast – though I suppose Anderson himself would disagree with both camps – the consensus is that the idea of 'words' as syntactic atoms is a fallacy.

      When you say The dog jumped, there are at least three "pieces" to jumped (a root, a verbalizer, and a tense suffix), and likely more. Same goes for 'dog' (root and nominalizer, and maybe more). Maybe 'the' is simplex, but that doesn't much matter for the argument.

      We know that a lot of the action that informs phonology (e.g. information about allophonic alternations) comes from morphological alternations. If it takes syntax to build morphologically complex units (which is the conceit both of DM and of Spanning), then you have a reason why the two might be linked in this way.

    3. It seems as though syntax has to come early no matter what phonology is doing.

      Note that Al-Sayyid Bedouin Sign Language (ABSL) has functioned perfectly well for life in the village for decades despite lacking distinctive phonological features, so its vocabulary is presumably not tiny. I suppose it shouldn't be hard for a child to learn a couple thousand nonphonological signs, especially using compounds (e.g. "chicken"+"small_round_object" means 'egg' in ABSL); you could have thousands of words built from hundreds of roots. Their iconicity makes them easy to remember and allows a lot of variation (e.g. 'tea' in ABSL looks like drinking tea from a cup, regardless of how many fingers are used).

      It might not be possible for a spoken language without distinctive phonological features to build up such a big vocabulary (as Hockett suggested). One factor is that the articulators in spoken language are less versatile than the ones used in sign language. I think it would be hard to produce hundreds of distinct vocalizations without them either being easily confused or else being systematized in terms of something like phonemes.

      The second difference, I think, is that more of what we tend to talk about has a characteristic visual aspect than a characteristic sound. I'm just guessing but it seems that vocalizations are going to be less iconic for most of what we want to talk about, making them more abstract, and hence prone to conventionalization. The conventionalization again leads to something like phonemes and/or distinctive features.

    4. That's very interesting. Jeff Heinz and I have argued recently that syntax and phonology are surprisingly similar on a formal level. The basic difference is that phonology operates over strings with tiers, whereas syntax operates over trees with tiers. However, the syntactic tree tiers include string tiers, so in a certain (loose) sense phonology is a fragment of syntax.

      This result still does not give you a clear directionality for acquisition since all three options are equally viable: 1) learning syntax involves learning string tiers, which are then co-opted for phonology; 2) learning phonology involves learning string tiers, which are then generalized to trees for syntax; 3) both types of tiers are gradually learned in parallel. However, the formal result does at least give a clear point of contact between syntax and phonology.
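      To make "strings with tiers" concrete, here is a toy sketch of the general idea (just an illustration, not the formalism from the GLOW slides): project a tier from a string by keeping only the segments of interest, then check a strictly local constraint on the projected tier, as in a toy version of sibilant harmony.

      # Toy tier-based constraint over strings: project the sibilant tier,
      # then ban adjacent mismatched sibilants on that tier.
      SIBILANTS = {"s", "S"}  # "S" stands in for the postalveolar sibilant (sh)

      def project_tier(word, tier_symbols):
          # keep only the tier symbols, preserving their order
          return [seg for seg in word if seg in tier_symbols]

      def harmonic(word):
          tier = project_tier(word, SIBILANTS)
          # well-formed iff no two adjacent elements on the tier disagree
          return all(a == b for a, b in zip(tier, tier[1:]))

      print(harmonic("sototos"))  # True: the sibilant tier is s...s
      print(harmonic("sototoS"))  # False: the tier is s...S, which the constraint bans

      The point of the projection step is that the constraint itself stays strictly local even though the offending segments can be arbitrarily far apart in the string; tree tiers do analogous work in syntax.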

      The slides from our GLOW talk give a more complete picture of the technical details.

  3. But what I'm really dying to know is, how did your debate with Hagoort go?
