Tuesday, April 9, 2019

Two cipher examples

I'm not convinced that I'm getting my point about "arbitrary" across, so maybe some toy examples from a couple of ciphers will help. Let's encipher "Colorless green ideas" in two ways.

1. Rot13: "Colorless green ideas" ⇒ "Pbybeyrff terra vqrnf". This method is familiar to old Usenet denizens. It exploits the fact that the Latin alphabet has 26 letters: rotating each letter 13 places (a⇒n, b⇒o, ... m⇒z, n⇒a, o⇒b, ...) makes the method its own inverse. That is, you decode a rot13 message by applying rot13 to it a second time. This is a special case of a Caesar cipher. Such ciphers are not very arbitrary, as they mostly preserve alphabetic letter order, but they "wrap" the alphabet around into a circle (like color in the visual system), with "z" being followed by "a". In a rotation cipher, once you figure out one of the letter codes, you've got them all: if "s" maps to "e", then "t" maps to "f", and so on.
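A minimal sketch of rot13 in Python (not part of the original post; just to make the self-inverse property concrete):

```python
def rot13(text):
    # Rotate each Latin letter 13 places, leaving other characters alone.
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr(base + (ord(ch) - base + 13) % 26))
        else:
            out.append(ch)
    return ''.join(out)

enc = rot13("Colorless green ideas")
print(enc)         # Pbybeyrff terra vqrnf
print(rot13(enc))  # Colorless green ideas -- applying rot13 twice recovers the original
```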

2. Scrambled alphabet cipher: Randomly permute the 26 letters to other letters, for example A..Z ⇒ PAYRQUVKMZBCLOFSITXJNEHDGW. This is a letter-based codebook. This is arbitrary, at least from the individual letter perspective, as it won't preserve alphabetic order, encoding "Colorless" as "Yfcftcqxx". So knowing one letter mapping (c⇒y) won't let you automatically determine the others.
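For concreteness, the scrambled-alphabet cipher can be sketched in Python with `str.translate`, using the permutation from the example:

```python
import string

# The random permutation from the example: A..Z => PAYRQUVKMZBCLOFSITXJNEHDGW
key = "payrquvkmzbclofsitxjnehdgw"
enc_table = str.maketrans(string.ascii_lowercase + string.ascii_uppercase,
                          key + key.upper())
dec_table = str.maketrans(key + key.upper(),
                          string.ascii_lowercase + string.ascii_uppercase)

print("Colorless".translate(enc_table))  # Yfcftcqxx
print("Yfcftcqxx".translate(dec_table))  # Colorless
```

Note that capitalization, spaces, and message length pass through untouched, which is exactly the point about preserved properties below.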

But this cipher does preserve various other properties, such as capitalization, number of distinct atomic symbols, spaces between words, message length, doubled letters, and sequential order in general.

Even word-based code books tend to preserve sequential order. That is, the message is encoded word by word from the beginning of the message to the end. But more sophisticated methods are possible, for example by padding the message with irrelevant words. It's less common to see the letters of the individual words scrambled, but we could do that by choosing a random permutation for each word length: say, words of length two are reversed (the permutation 21), so that "to" would be encoded as "to" ⇒ "ot" ⇒ "fj" (using the scrambled alphabet above). Words of length three might be scrambled as 312, length four as 2431, and so on. Adding this encryption technique will break apart some doubled letters. But the word order would still be preserved across the encryption.
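A sketch of this per-word-length scrambling, combined with the scrambled alphabet from example 2 (the permutations for lengths 2 to 4 are the ones mentioned in the text; in practice each would be chosen at random):

```python
# Fixed permutation per word length (1-indexed positions), as in the text:
# length 2 -> 21 (reversal), length 3 -> 312, length 4 -> 2431.
perms = {2: (2, 1), 3: (3, 1, 2), 4: (2, 4, 3, 1)}

# The scrambled alphabet from example 2.
key = "payrquvkmzbclofsitxjnehdgw"
sub = {chr(ord('a') + i): key[i] for i in range(26)}

def encode_word(word):
    p = perms.get(len(word))
    if p:  # scramble the letter order first, if a permutation is defined
        word = ''.join(word[i - 1] for i in p)
    return ''.join(sub.get(c, c) for c in word)  # then substitute each letter

print(encode_word("to"))  # "to" -> "ot" -> "fj"
```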

These toy examples are just to show that "arbitrary" vs "systematic" isn't an all-or-nothing thing in a mapping. You have to consider all sorts of properties of the input and output representations and see which properties are being systematically preserved (or approximately preserved) across the mapping, and which are not. Temporal relations (like sequencing) are particularly important in this respect.


  1. I agree that reasoning in all-or-nothing terms is probably creating some discord. And since we're talking about ciphers, my instinct is to go even further and start talking about arbitrariness in terms of mutual information...

    We can interpret the claim that phonological representations are arbitrary as the claim that the mutual information between phonology and phonetics is ZERO (i.e. there is ZERO information about phonetics in phonology). In other words, phonology is a perfectly secret encryption of phonetics, and you can only recover information about one from the other with a key (a lookup table/transducer/whatever-it-is). This seems to be what substance-free phonology is arguing for - anything less leaves you with at least a bit of phonetic substance, as far as I can see.

    However, I can't see that the empirical arguments for substance-free phonology (weird rhotics, sign language, crazy rules, etc.) actually suggest ZERO mutual information. What they suggest is that the mutual information between phonology and phonetics is less than the channel capacity, i.e. some information about phonology can't be recovered from phonetics. This is a much weaker claim and seems to be perfectly commensurable with partial veridicality (and topographic representations, etc).
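To make the difference between the two claims concrete, here is a toy mutual-information calculation (a sketch; the four-symbol "languages" are invented for illustration). A one-to-one mapping gives maximal MI, a many-to-one mapping gives intermediate MI, and only statistical independence gives ZERO:

```python
from math import log2
from collections import Counter

def mutual_information(pairs):
    # I(X;Y) = sum over (x,y) of p(x,y) * log2( p(x,y) / (p(x) p(y)) ),
    # estimated from a list of observed (x, y) pairs.
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# One-to-one: knowing y tells you x exactly, so I(X;Y) = H(X) = 2 bits.
print(mutual_information([("A", "p"), ("B", "a"), ("C", "y"), ("D", "r")]))  # 2.0
# Many-to-one: some information about X cannot be recovered from Y.
print(mutual_information([("A", "p"), ("B", "p"), ("C", "y"), ("D", "y")]))  # 1.0
# Independent: ZERO mutual information.
print(mutual_information([("A", "p"), ("A", "y"), ("B", "p"), ("B", "y")]))  # 0.0
```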

    1. I agree, sign language phonology and weird rhotics don’t conclusively demonstrate that phonological features are substance-free, but I still wonder how you could test the claim that features universally have some innate connection to substance, given how abstract the innate substance must be.

      If I’m understanding Bill correctly from the previous post, he suggests that there is a set F of universal innate features which are specific to language, by which I take to mean that they were selected in evolution for their advantages in language acquisition and/or use. The features in F are flexible enough in what they become grounded to in acquisition that a feature which gets its substance from, say, tongue curl in a spoken language, and which we annotate [retroflex], could alternatively get its substance from, say, finger curl in a sign language. Similarly, suppose that lowering the soft palate, which we annotate [nasal], could get its substance from having the palm turned down in a sign. (The labels [retroflex] and [nasal] appear in the feature inventory of the Avery & Idsardi 2001 paper to which Bill referred me.)

      But, I take it, given the partial veridicality requirement, the elements in F are not so flexible that the one we call [retroflex] could get its substance from nasality, or [nasal] from retroflexion, in a spoken language. And presumably they are also not so flexible that the sign language grounding of [retroflex] and [nasal] could be reversed, so that (in this toy example) [retroflex] was grounded in the palm facing down while [nasal] was grounded in finger curl in some sign languages but not others. If the members of F were completely unconstrained with respect to what they linked to in sign language, then assuming sign languages are acquired the same way and as easily as spoken languages, and are as communicatively effective, then F would be evolutionarily superfluous.

      So, I agree with you as far as the existence of F not being implausible, but I find it hard to see what predictions it makes, or how to confirm or disconfirm it.

      For instance, if the feature geometry were also universal, then you could look for parallels there --- in Avery & Idsardi’s feature geometry, [nasal] and [open] are grouped under “Soft Palate” and separate from [retroflex] and [down] (I think for dental sounds) under “Coronal: Tongue Curl”, and grouped features can be expected to pattern together against features outside their groups. But if the grouping is motivated by anatomy, while the partial veridicality of the features is sensitive to some other property, then we wouldn’t expect the geometry to be the same for spoken and signed languages, so no geometric isomorphism is predicted – curled fingers could be more closely linked to downward palm than nasality is to retroflexion. In the absence of that kind of prediction, it is hard for me to see how you could show whether finger curl is related to the same F as retroflexion (I mean absent some system for looking into the brain that goes beyond current technology).

    2. Joe, thank you. This is a very clear and useful idea for quantifying the degree of veridicality. It provides an idea for how to quantify information between sensory and motor areas too, perhaps along the lines of Granger causality?

    3. Peter-

      I was too slow and Bill has already answered most of your comment far better than I could.

      I’ll just add that my comment was not just that rhotics and sign language fail to prove that phonology is substance-free. I also had in mind the potentially absurd conclusion that follows from taking literally the claim that phonological representations should contain ZERO phonetic information. This seems to entail that phonological representations are a perfectly secret encryption, which should strike us as very odd (why is the brain trying so hard to hide information from itself?).

      Just to elaborate on what I mean: even substance-free analyses don't actually contain ZERO phonetic information, because as soon as some phonological feature in an adult grammar correlates with something phonetic during speech, the mutual information between the two is non-zero. Adding the stipulation that the phonetic information is in the transducer and not the phonological features doesn't change this fact (we can't stipulate that information doesn't exist - either the data correlate or they don't). So my suspicion is that when someone argues that phonology should contain no phonetic information, they’re implicitly talking about something other than information in a natural sense. A phonological analysis from which ZERO phonetic information could be recovered would look nothing like any extant theory of phonology (and would be extremely opaque).

      I took this little reductio line as supporting Bill’s warning against all-or-nothing arguments.


      As far as I know, transfer entropy is the information theoretic measure used for inferring connectivity. But I honestly don’t know enough about it to say how well it would work in this context (probably though?).

    4. I don't think the tools of information theory will help here: even if there is a perfect encryption between phonology and phonetics, if it is a bijection (as all encryptions must be) then the mutual information will be maximal (i.e. equal to the entropy). Encryption is a computational issue in this case not a statistical one.

      If there is zero mutual information then they will be statistically independent and uncorrelated which isn't the case surely, since there are causal links.

      I think the general view in philosophy of mind nowadays is that information theory on its own is too weak to be used to resolve these sorts of representational issues.

    5. Hi Alex-

      I think I'm misunderstanding your first paragraph. The definition of "perfectly secret encryption" I learned was that the MI of the message and the cipher should be zero. Equivalently, this is the case where the entropy of the message equals the conditional entropy of the message given the cipher. So, I(M;C) = H(M) - H(M|C) = 0.
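A toy check of this definition (a sketch, assuming a one-bit one-time pad): with a uniform, independent key, the ciphertext is statistically independent of the message, so I(M;C) = 0:

```python
from math import log2
from collections import Counter
from itertools import product

# One-time pad on a single bit: C = M XOR K, with M and K uniform and independent.
# Enumerating the four equiprobable (m, k) cases gives the joint distribution of (M, C).
pairs = [(m, m ^ k) for m, k in product([0, 1], repeat=2)]

n = len(pairs)
pmc = Counter(pairs)
pm = Counter(m for m, _ in pairs)
pc = Counter(c for _, c in pairs)
mi = sum((cnt / n) * log2((cnt / n) / ((pm[m] / n) * (pc[c] / n)))
         for (m, c), cnt in pmc.items())
print(mi)  # 0.0 -- perfect secrecy: the cipher carries no information about the message
```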

      I think we agree though that this case is probably neither possible nor desirable for a theory of phonological representations - and I think a substance-free advocate would have to concede this too. This conclusion is relevant if we start a conversation from the assumption that "substance-free" means "contains zero phonetic information" because this can't really be true (in the Shannonian sense of information). This is the limit case of Bill's cipher examples, I think.

    6. Yes, from the perspective of someone trying to decrypt the code (where the secret key has high entropy), but not from the perspective of the two communicating participants, who must both know the secret key or they can't communicate. If there is no MI, there can be no communication: in this case, between these two modules.

      So even if it is completely substance free, there must still be high MI.

      I think this is the right way to think about it (i.e. not treating the secret key as an unknown random variable), especially when we consider the broader context of two people communicating with each other. But as I type this out, I feel maybe I am misunderstanding the thought experiment.

    7. Ah, I see. I was imagining the key is the transducer between phonetics/phonology. So without the transducer, phonological representations would tell you literally nothing about phonetics - this is my understanding of what is claimed by substance-free phonology.

  2. Hi Peter:

    No, I am *not* suggesting that there are modality agnostic features in the way that you describe.

    What I think is that there are features like [retroflex] which provide the initial state for the memory system connection between a motor action (innervation of the longitudinal muscles of the tongue) and an auditory percept (depressed F3). During phonological development these can be tuned somewhat to add enhancing motor accompaniments (lip rounding in English) and recruitment of allied auditory information. [retroflex] is a feature for speech, part of the speech MAP (memory-action-perception) loop.

    There is a separate feature, let's call it [pointer] which connects the motor extension of the forefinger with the detection of an orthogonal angle in the object-centered part of the visual system, here focused on the hand. Again, this can be tuned somewhat during phonological development. This is a feature for sign.

    To look for these kinds of connections we would examine the nature of the connections between sensory cortex and motor cortex, for example by comparing the -otopic maps in Mesgarani et al 2014 and Bouchard et al 2013, looking for topological similarities between the two maps.

    The similarities between sign and speech phonologies are then due to the representation of information in the EFP/S format in both systems. If the systems have similar graph algebras, then they will also have similar graph operations.

    Now, it is also possible that the connection between sensory cortex and motor cortex is via a multisensory integration area in STS, see Hickok & Poeppel 2007. If that is the case, then it is also possible that this is a source of shared properties between sign and speech phonologies.

    Bouchard, K. E., Mesgarani, N., Johnson, K., & Chang, E. F. (2013). Functional organization of human sensorimotor cortex for speech articulation. Nature, 495(7441), 327–332.

    Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews. Neuroscience, 8(5), 393–402.

    Mesgarani, N., Cheung, C., Johnson, K., & Chang, E. F. (2014). Phonetic feature encoding in human superior temporal gyrus. Science , 343(January), 1006–1010.

    1. Thanks for these clarifications. Now I see how you avoid the problem I was worried about of feature abstractness -- your features can be richer in their content because they are not modality independent. Do you further think that a feature like [pointer], which is not motivated by spoken language, is innate? That there has been some selective pressure on our ancestors to be able to rapidly identify a bent finger as such?

    2. Yes, my position is that it's innate. I think that there's some evidence for this in infant perception of biological motion, e.g. Simion et al 2008 (2-day-olds). The point-light display stuff on ASL seems to show that for the fingers it is the position of the fingertips that is the most important, Poizner et al 2008 (on adults).

      Kuhlmeier, V. A., Troje, N. F., & Lee, V. (2010). Young infants detect the direction of biological motion in point-light displays. Infancy: The Official Journal of the International Society on Infant Studies, 15(1), 83–93.

      Poizner, H., Bellugi, U., & Lutes-Driscoll, V. (1981). Perception of American sign language in dynamic point-light displays. Journal of Experimental Psychology. Human Perception and Performance, 7(2), 430–440.

      Simion, F., Regolin, L., & Bulf, H. (2008). A predisposition for biological motion in the newborn baby. Proceedings of the National Academy of Sciences of the United States of America, 105(2), 809–813.

    3. Sorry, Poizner et al is 1981, not 2008.

    4. Okay, now I am getting a better understanding of your position. Thanks for explaining, and thanks for all the references.

      So, a sign-language feature like [pointer] is innate -- but presumably not specific to language, since it presumably did not emerge evolutionarily from pressures related to language acquisition or use, right?

      In that case, the segment inventory of a spoken language is based on features specific to language (perhaps motivated by the fact that e.g., whether your tongue is curled or not matters for vocalizations but not systematically for much else), while the segment inventory of a signed language is based on features not specific to language, if I'm reading you right.

    5. You raise a very tricky question. Or at least I think it's tricky. Various authors have suggested that the properties of the visual and auditory systems have developed as efficient, sparse codes based on natural scene statistics (e.g. Lewicki, Purves).

      As you say, the MAP connections for speech don't seem to be much good for anything else. The auditory code might also be efficient at non-speech classification into natural world sound classes (fricatives are like rushing water?) but the connection to the motor activity doesn't seem to do anything but speech. Maybe in some animals an auditory stimulus can produce a stereotypical orofacial response like biting? For me, I wouldn't want to try to build speech out of things like that.

      Connecting visual features for detecting biological motion and then executing an appropriate motor action of your own does seem to have greater general utility, e.g. a fish does a tail flip => stab it with a spear. So the visual MAP system seems to be wider than just language.

      Did the visual MAP features undergo subsequent specialization for sign language? That's going to be really hard to assess, in my opinion. There is some work looking at ASL articulators (Malaia et al 2018); there they use the ankles as the control case (see their Fig. 1). So I guess that we would try to look for some kind of specialization to the upper body as opposed to the legs? Or look for privileged representations of the upper body in the multisensory integration area? For what it's worth, the fusiform face area is adjacent to STS. Is the fusiform body area divided at the waist? Maybe look at Corina & Knapp? This just looks really difficult to me.

      Corina, D. P., & Knapp, H. P. (2008). Signed language and human action processing. Annals of the New York Academy of Sciences, 1145(1), 100-112.

      Long, F., Yang, Z., & Purves, D. (2006). Spectral statistics in natural scenes predict hue, saturation, and brightness. Proceedings of the National Academy of Sciences of the United States of America, 103(15), 6013–6018.

      Malaia, E., Borneman, J. D., & Wilbur, R. B. (2018). Information Transfer Capacity of Articulators in American Sign Language. Language and Speech, 61(1), 97–112.

      Smith, E. C., & Lewicki, M. S. (2006). Efficient auditory coding. Nature, 439(7079), 978–982.

      Stilp, C. E., & Assgari, A. A. (2015). Languages across the world are efficiently coded by the auditory system. Proceedings of Meetings on Acoustics, 23(1), 060003.

  3. Hi,
    I'm not a skilled blogger, reading and commenting only once in a while. Since I was last active, some things were discussed that I'd like to respond to. This comes in four installments because it's too long for a single comment - sorry about that.
    In the meantime I've finished a paper that discusses some issues that Bill brought up in his More on "arbitrary" post, namely:
    1. is lexical translation as we know it from Vocabulary Insertion the odd man out in intermodular communication?
    2. Gallistel & King's (2010: xi, 51-53) argument against lexical translation, or lookup tables as they call it.

    The paper is called On the lexical character of intermodular communication and argues that lexical translation is not the odd man out but rather a good candidate to be generalized from the one interface we know well (Vocabulary Insertion) to other intermodular communication, and also to mappings between real-world items (such as wave lengths measured in nanometers) and cognitive categories (the perceived color associated with a given band of nanometers).
    This is a first distinction to be made, I think, which may cause confusion (it did for me) if it isn't made explicit: intermodular communication is about the exchange of information between two cognitive modules (like morpho-syntax and phonology). When you cross the real-world boundary, i.e. when real-world items (such as wave lengths) are mapped onto cognitive categories (perceived colors), you are talking about something else, since the real world is not a module. Real-world-to-cognitive mapping is done for all sensory input humans have, including the phonetics-phonology mapping that the discussion started with. Since cognitive categories are necessarily discrete, they box a typically continuous sensory signal into a finite set of categories, and it seems to me that all of these real-world-to-cognitive mappings are therefore lexical, i.e. go through a look-up table. Is there any alternative for the match of, say, colors with wave lengths (and so on)?
    In this context, what is Gallistel & King's actual argument against lexical translation? Having read them, I find there is none. They argue against lexical mapping for true modular activity, i.e. the relationship between the input of the module and its output. But they are not concerned with the interface, i.e. how modules exchange information. Their argument is obvious: modular computation cannot work with look-up tables since it potentially describes an infinite set of items (such as the set of well-formed sentences of a language). That's what they call the infinitude of the possible, whose description can only be achieved by a (mathematical) function. So this does not tell us anything about how the interface works. Or rather it does, negatively: we know for sure that interfaces are workable with look-up tables, because the information to be transmitted is necessarily made of a finite set of items. Between two modules, these are the items of the respective vocabularies, which are necessarily finite since they are stored in long-term memory. To take the language example: the infinite set of outputs of the modular computation is written in a finite set of vocabulary items (discrete infinity), and translation is about converting these into some other vocabulary.

  4. In all that there is no argument against lexical translation via look-up table. But there is actual evidence that interfaces go through a lexicon, since finite sets of items need to be matched that are incommensurable. Incommensurability is a property of the items matched both when two cognitive modules communicate ("past tense" ↔ -ed) and when a cognitive category relates to a real-world item (wave length ↔ color). The contention of substance-free phonological primes is that the only reason why analysts believe they can compare phonetic and phonological items is that they have named the latter after the former themselves (phonetic labiality becomes [labial]). But in fact the relationship between phonological and phonetic items is just like the one between wave lengths and colors.

    There was some discussion on the meaning of the word "arbitrary". Bill correctly points out that having an unpredictable "anything-goes" association between items of two separate lists (I hope I make myself clear, since I can't use the word "arbitrary" when discussing the definition of the word "arbitrary") does not mean that the relationship is non-systematic. Temporal relations such as sequencing may still be systematic and hard-wired even if any item can be associated with any other item.
    I fully agree, and when I use the word "arbitrary" I only mean the above: the fact that any item of list A may be associated with any item of list B.

    Apples and pears - a misunderstanding
    In the RoZolution post discussion, Bill says that

    "There are many examples in sensation and perception of lawful (non-arbitrary) intermodular communication. So intermodular communication does not imply arbitrary mappings. See, for example, any case of -otopic motor or sensory organization, such as retinotopy, tonotopy, somatotopy ( )."

    William agrees:
    "Thus the sensory-perceptual mapping isn't arbitrary at all, even in cases of cross-modal plasticity.
    Or rather, the mapping is only partially arbitrary, as Bill has emphasized."

    This is a misunderstanding: what you are talking about, Bill and William, are specific wirings in the brain. The Wikipedia entry says "neuroanatomy" in its title. No doubt these wirings are not arbitrary, or only partially arbitrary (as in cases of plasticity, where brain region X, usually devoted to activity A, is recruited for activity B, like the recruitment of auditory cortex for visual purposes in deaf subjects).
    That's all fine, but I am talking about the mind, not about the brain. Whatever the wirings in the brain, they won't tell us anything about how cognitive items of two distinct vocabularies are related (Vocabulary Insertion), or how a real-world item is associated with a cognitive category (wave length - color). The claim is that these associations are arbitrary (this word being defined as above). There is no claim about how things are wired in the brain, and raw brain anatomy won't be able to contribute anything to the question as far as I can see.

  5. (In)commensurability
    Now, having gone through this, I better understand your idea of "partial veridicality", Bill. It is a good match for these kinds of neuroanatomical patterns. And looking at things from this angle, the lexical translation of Vocabulary Insertion, where the to-be-related items are incommensurable (cannot even be compared), indeed appears to be the odd man out.
    I reckon, then, that we are not talking about the same things, i.e. you about apples and me about pears. If we agree that we leave aside the brain-based stuff for a moment and only talk about information exchange between two modules *in the mind*, or about the association of a real-world item and a cognitive category, I would be interested to see whether we still disagree:
    1. is there evidence for interfaces that are not list-based?
    2. is there evidence for associations that correspond to partial veridicality, i.e. where the to-be-related items are commensurable, i.e. allow for the assessment of similarity?

    In order to carry out a veridicality calculus at all, you need to be able to compare the similarity of the items associated. The thing is that you can't. In Vocabulary Insertion you can't: there is no sense in which "past tense" is any more or less similar to "-ed" than to "-a", "-pet" or "-trygrrd". That's because the two vocabularies whose individual items are matched are different in kind, i.e. incommensurable. That's what Fodor's domain-specificity says. Could you imagine a case where two distinct domain-specific vocabularies would be commensurable, i.e. where it would be possible to assess the similarity of items that belong to either set?
    The same goes for the association of real-world items with cognitive categories: trying to assess the (dis)similarity of "450-485 nm" and "blue", as opposed to, say, "450-485 nm" and "red" (or any other perceived color for that matter) is pointless. Wave lengths and perceived colors are incommensurable and you won't be able to tell whether the match is veridical, non-veridical or partially veridical.
    The case of phonetics and phonology is a little different, since the cognitive category (phonological) is not based on any percept reported by subjects: people have no idea what phonetic labiality is associated with in their phonology. Maybe there is perception of phonetic items, but we don't know anything about it since subjects have no conscious access to it. So analysts name the cognitive categories they need to talk about after the phonetic item they can observe: the phonetic event "labiality" becomes the cognitive category [labial]. But that does not tell us anything about what the cognitive item is made of; it only tells us something about the wish of analysts to talk about cognitive categories. There is no evidence independent of the phonetic event that would tell us about the identity of the cognitive category. This is precisely what the idea of substance-free melodic primes is about: the only thing we know about is the real-world item and the existence of an association with a cognitive category. So this knowledge is implemented in the transduction "alpha ↔ labiality". But other than that, alpha lives its life in the phonology without knowing that it will eventually come out as labiality.

  6. All this comes down to the following generalization, which as far as I can tell from the discussion is correct: interface relations are matches between items that are incommensurable, i.e. whose similarity cannot be calculated. Here "interfaces" includes both "between two cognitive modules" and "between a cognitive category and real-world items" (but excludes any relationship involving brain-based items). Incommensurability follows from Fodor's domain-specificity.
    A second generalization is that all interface relations of that kind that I have seen are list-type (look-up tables).
    If this is true there is no way to talk about partial veridicality in interface matters.

    About ZERO phonetic information in the phonology that Joe Collins and Alex Clark discuss. Yes what Alex and Joe say in the end is exactly the way I am thinking about the matter:

    "the two communicating participants […] must both know the secret key, or they can't communicate. If there is no MI, there can be no communication: in this case between these two modules.
    So even if it is completely substance free, there must still be high MI."

    "the key is the transducer between phonetics/phonology. So without the transducer, phonological representations would tell you literally nothing about phonetics - this is my understanding of what is claimed by substance-free phonology."

    The secret key is the look-up table that matches items of the two modules. The modules couldn't communicate without that key, and the key in the case of the phonology-phonetics communication needs to be learned / discovered by the child during L1 acquisition. Dresher (2014) and Odden (2019) explain how that acquisition works.
    In this setup there is literally ZERO phonetic information *in* the phonology. Phonetic information is only in the secret key = the transducer.
