Faculty of Language: Two cipher examples

Tuesday, April 9, 2019

Two cipher examples

I'm not convinced that I'm getting my point about "arbitrary" across, so maybe we should try some toy examples from a couple of ciphers. Let's encipher "Colorless green ideas" in a couple of ways.

1. Rot13: "Colorless green ideas" ⇒ "Pbybeyrff terra vqrnf". This method is familiar to old Usenet denizens. It makes use of the fact that the Latin alphabet has 26 letters by rotating them 13 places (a⇒n, b⇒o, ... m⇒z, n⇒a, o⇒b, ...) and so this method is its own inverse. That is, you decode a rot13 message by doing rot13 on it a second time. This is a special case of a Caesar cipher. Such ciphers are not very arbitrary as they mostly preserve alphabetic letter order, but they "wrap" the alphabet around into a circle (like color in the visual system) with "z" being followed by "a". In a rotation cipher, once you figure out one of the letter codes, you've got them all. So if "s" maps to "e" then "t" maps to "f" and so on.

2. Scrambled alphabet cipher: Randomly permute the 26 letters to other letters, for example A..Z ⇒ PAYRQUVKMZBCLOFSITXJNEHDGW. This is a letter-based codebook. This is arbitrary, at least from the individual letter perspective, as it won't preserve alphabetic order, encoding "Colorless" as "Yfcftcqxx". So knowing one letter mapping (c⇒y) won't let you automatically determine the others.

But this cipher does preserve various other properties, such as capitalization, number of distinct atomic symbols, spaces between words, message length, doubled letters, and sequential order in general.

Even word-based code books tend to preserve sequential order. That is, the message is input word-by-word from the beginning of the message to the end. But more sophisticated methods are possible, for example by padding the message with irrelevant words. It's less common to see the letters of the individual words scrambled, but we could do that for words of varying lengths, say by having words of length 2 reversed, 21, so that "to" would be encoded as "to" ⇒ "ot" ⇒ "fj". And words of length three might be scrambled 312, length four as 2431, and so on, choosing a random permutation for each word length. Adding this encryption technique will break apart some doubled letters. But the word order would still be preserved across the encryption.

These toy examples are just to show that "arbitrary" vs "systematic" isn't an all-or-nothing thing in a mapping. You have to consider all sorts of properties of the input and output representations and see which properties are being systematically preserved (or approximately preserved) across the mapping, and which are not. Temporal relations (like sequencing) are particularly important in this respect.

19 comments:

Joe CollinsApril 11, 2019 at 2:31 AM
I agree that reasoning in all-or-nothing terms is probably creating some discord. And since we're talking about ciphers, my instinct is to go even further and start talking about arbitrariness in terms of mutual information...

We can interpret the claim that phonological representations are arbitrary, as the claim that the mutual information between phonology and phonetics is ZERO (i.e. there is ZERO information about phonetics in phonology). In other words, phonology is a perfectly secret encryption of phonetics and you can only recover information about one from the other with a key (a lookup table/transducer/whatever-it-is). This seems to be what substance-free phonology is arguing for - anything less leaves you with at least a bit of phonetic substance, as far as I can see.

However, I can't see that the empirical arguments for substance-free phonology (weird rhotics, sign language, crazy rules, etc.) actually suggest ZERO mutual information. What they suggest is that the mutual information between phonology and phonetics is less than the channel capacity, i.e. some information about phonology can't be recovered from phonetics. This is a much weaker claim and seems to be perfectly commensurable with partial veridicality (and topographic representations, etc).
ReplyDelete
Replies
idsardiApril 11, 2019 at 12:04 PM
Hi Peter:

No, I am *not* suggesting that there are modality agnostic features in the way that you describe.

What I think is that there are features like [retroflex] which provide the initial state for the memory system connection between a motor action (innervation of the longitudinal muscles of the tongue) and an auditory percept (depressed F3). During phonological development these can be tuned somewhat to add enhancing motor accompaniments (lip rounding in English) and recruitment of allied auditory information. [retroflex] is a feature for speech, part of the speech MAP (memory-action-perception) loop.

There is a separate feature, let's call it [pointer] which connects the motor extension of the forefinger with the detection of an orthogonal angle in the object-centered part of the visual system, here focused on the hand. Again, this can be tuned somewhat during phonological development. This is a feature for sign.

To look for these kinds of connections we would examine the nature of the connections between sensory cortex and motor cortex, for example by comparing the -otopic maps in Mesgarani et al 2014 and Bouchard et al 2013, looking for topological similarities between the two maps.

The similarities between sign and speech phonologies are then due to the representation of information in the EFP/S format in both systems. If the systems have similar graph algebras, then they will also have similar graph operations.

Now, it is also possible that the connection between sensory cortex and motor cortex is via a multisensory integration area in STS, see Hickok & Poeppel 2007. If that is the case, then it is also possible that this is a source of shared properties between sign and speech phonologies.

Bouchard, K. E., Mesgarani, N., Johnson, K., & Chang, E. F. (2013). Functional organization of human sensorimotor cortex for speech articulation. Nature, 495(7441), 327–332.

Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews. Neuroscience, 8(5), 393–402.

Mesgarani, N., Cheung, C., Johnson, K., & Chang, E. F. (2014). Phonetic feature encoding in human superior temporal gyrus. Science , 343(January), 1006–1010.
ReplyDelete
Replies
TobiasApril 22, 2019 at 11:21 AM
Hi,
I'm not a skilled blogger, reading and commenting only once in a while. So since I've last been active there are some things that were discussed I'd like to respond to. That comes in four installments because it's too long for a blog entry - sorry about that.
In the meantime I've finished a paper that discusses some issues that Bill brought up in his More on "arbitrary" post, namely:
1. is lexical translation as we know it from Vocabulary Insertion the odd man out in intermodular communication?
2. Gallistel & King's (2010: xi, 51-53) argument against lexical translation, or lookup tables as they call it.

The paper is called On the lexical character of intermodular communication and argues that lexical translation is not the odd man out but rather a good candidate to be generalized from the one interface we know well (Vocabulary Insertion) to other intermodular communication, and also to mappings between real-world items (such as wave lengths coming in nanometers) and cognitive categories (the color perceived that is associated to a given band of nm).
This is a first distinction to be made, I think, which may cause confusion (it did to me) if it isn't made explicit: intermodular communication is about the exchange of information between two cognitive modules (like morpho-syntax and phonology). When you cross the real-world boundary, i.e. when real-world items (such as wave lengths) are mapped onto cognitive categories (colors perceived), you are talking about something else since the real world is not a module. Real-world-to-cognitive mapping is done for all sensory input humans have, including the phonetics-phonology mapping that the discussion started with. Since cognitive categories are necessarily discrete and hence box a typically continuous sensory signal into a finite set of cognitive categories, it seems to me that all of these real-world-to-cognitive mappings are lexical, i.e. go through a look-up table. Is there any alternative for the match of, say, colors with wave lengths (and so on)?
In this context, what is Gallistel & King's actual argument against lexical translation? After having read them, there is no. They argue against lexical mapping for true modular activity, i.e. the relationship between the input of the module and its output. But they are not concerned with the interface, i.e. how modules exchange information. Their argument is obvious: modular computation cannot work with look-up tables since it potentially describes an infinite set of items (such as the set of well-formed sentences of a language). That's what they call infinitude of the possible, whose description can only be achieved by a (mathematical) function. So this does not tell us anything about how the interface works. Or rather it does, negatively: we know for sure that interfaces are workable with look-up tables because the information to be transmitted is necessarily made of a finite set of items. Between two modules, these are the items of the respective vocabularies, which are necessarily finite since they are stored in long term memory. To take the language example: the infinite set of outputs of the modular computation are written in a finite set of vocabulary items (discrete infinity), and translation is about converting these into some other vocabulary.
ReplyDelete
Replies
TobiasApril 22, 2019 at 11:22 AM
In all that there is no argument against lexical translation via look-up table. But there is actual evidence for interfaces to go through a lexicon since finite sets of items need to be matched that are incommensurable. Incommensurability is a property of items that are matched both when two cognitive modules communicate ("past tense" ↔ -ed), but also when a cognitive category relates to a real-world item (wave length ↔ color). The contention of substance-free phonological primes is that the only reason why analysts believe they can compare phonetic and phonological items is that they have named the latter according to the former themselves (phonetic labiality becomes [labial]). But that in fact the relationship between phonological and phonetic items is just like the one between wave lengths and colors.

Clarification
There was some discussion on the meaning of the word "arbitrary". Bill correctly points out that having an unpredictable "anything-goes" association between items of two separate lists (I hope I make myself clear, since I can't use the word "arbitrary" when discussing the definition of the word "arbitrary") does not mean that the relationship is non-systematic. Temporal relations such as sequencing may still be systematic and hard-wired even if any item can be associated with any other item.
I fully agree, and when I use the word "arbitrary" I only mean the above: the fact that any item of list A may be associated with any item of list B.

Apples and pears - a misunderstanding
In the RoZolution post discussion, Bill says that

"There are many examples in sensation and perception of lawful (non-arbitrary) intermodular communication. So intermodular communication does not imply arbitrary mappings. See, for example, any case of -otopic motor or sensory organization, such as retinotopy, tonotopy, somatotopy ( https://en.wikipedia.org/wiki/Topographic_map_(neuroanatomy) )."

William agrees:
"Thus the sensory-perceptual mapping isn't arbitrary at all, even in cases of cross-modal plasticity.
Or rather, the mapping is only partially arbitrary, as Bill has emphasized."

This is a misunderstanding: what you are talking about, Bill and William, are specific wirings in the brain. The Wikipedia entry says "neuroanatomy" in its title. No doubt these wirings are not arbitrary, or only partially arbitrary (like in cases of plasticity where brain space X usually devoted to activity A is recruited for activity B, as the recruitment of auditory cortex for visual purposes in deaf subjects).
That's all fine, but I am talking about the mind, not about the brain. Whatever the wirings in the brain, they won't tell us anything about how cognitive items of two distinct vocabularies are related (Vocabulary Insertion), or how a real-world item is associated to a cognitive category (wave length - color). The claim is that these associations are arbitrary (this word being defined, see above). There is no claim about how things are wired in the brain, and raw brain anatomy won't be able to contribute anything to the question as far as I can see.
ReplyDelete
Replies
TobiasApril 22, 2019 at 11:28 AM
(In)commensurability
Now having gone through this, I better understand your idea of "partial veridicality", Bill. It is a good match for this kind of neuroanatomic patterns. And looking at things from this angle, the lexical translation of Vocabulary Insertion where the to-be-related items are incommensurable (cannot even be compared) indeed appears to be the odd man out.
I reckon, then, that we are not talking about the same things, i.e. you about apples and me about pears. If we agree that we leave aside the brain-based stuff for a moment and only talk about information exchange between two modules *in the mind*, or about the association of a real-world item and a cognitive category, I would be interested to see whether we still disagree:
1. is there evidence for interfaces that are not list-based?
2. is there evidence for associations that correspond to partial veridicality, i.e. where the to-be-related items are commensurable, i.e. allow for the assessment of similarity?

In order to carry out a veridicality calculus at all, you need to be able to compare the similarity of the items associated. The thing is that you can't. In Vocabulary Insertion you can't: there is no sense in which "past tense" is any more or less similar to "-ed" than to "-a", "-pet" or "-trygrrd". Because the two vocabularies of which individual items are matched are different in kind – incommensurable. That's what Fodor's domain-specificity says. Could you imagine a case where two distinct domain-specific vocabularies would be commensurable, i.e. where it would be possible to assess the similarity of items that belong to either set?
The same goes for the association of real-world items with cognitive categories: trying to assess the (dis)similarity of "450-485 nm" and "blue", as opposed to, say, "450-485 nm" and "red" (or any other perceived color for that matter) is pointless. Wave lengths and perceived colors are incommensurable and you won't be able to tell whether the match is veridical, non-veridical or partially veridical.
The case of phonetics and phonology is a little different since the cognitive category (phonological) is not based on any percept reported by subjects: people have no idea what phonetic labiality is associated to in their phonology. Maybe there is perception of phonetic items, but we don't know anything about it since subjects have no conscious access to it. So analysts name the cognitive categories they need to talk about after the phonetic item they can observe: the phonetic event "labiality" becomes the cognitive category [labial]. But that does not tell us anything about what the cognitive item is made of – it only tells us something about the wish of analysts to talk about cognitive categories. There is no evidence independent of the phonetic event that would tell us about the identity of the cognitive category. This is precisely what the idea of substance-free melodic primes is about: the only thing we know is about the real-world item and the existence of an association with a cognitive category. So this knowledge is implemented in the transduction "alpha ↔ labiality". But other than that alpha lives its life in the phonology without knowing that it will eventually come out as labiality.
ReplyDelete
Replies
TobiasApril 22, 2019 at 11:29 AM
All this comes down to the following generalization, which as far as I can tell given the discussion is correct: interface relations are matches between items that are incommensurable, i.e. whose similarity cannot be calculated. Where "interfaces" include "between two cognitive modules" and "between a cognitive category and real-world items" (but which excludes any relationship involving brain-based items). Incommensurability follows from Fodor's domain-specificity.
A second generalization is that all interface relations of that kind that I have seen are list-type (look-up tables).
If this is true there is no way to talk about partial veridicality in interface matters.

P.S.
About ZERO phonetic information in the phonology that Joe Collins and Alex Clark discuss. Yes what Alex and Joe say in the end is exactly the way I am thinking about the matter:

"the two communicating participants […] must both know the secret key, or they can't communicate. If there is no MI, there can be no communication: in this case between these two modules.
So even if it is completely substance free, there must still be high MI."
Alex

"the key is the transducer between phonetics/phonology. So without the transducer, phonological representations would tell you literally nothing about phonetics - this is my understanding of what is claimed by substance-free phonology."
Joe

The secret key is the look-up table that matches items of the two modules. The modules couldn't communicate without that key, and the key in the case of the phonology-phonetics communication needs to be learned / discovered by the child during L1 acquisition. Dresher (2014) and Odden (2019) explain how that acquisition works.
In this setup there is literally ZERO phonetic information *in* the phonology. Phonetic information is only in the secret key = the transducer.
ReplyDelete
Replies
quyen12August 24, 2019 at 3:56 AM
Its a great pleasure reading your post.Its full of information I am looking for and I love to post a comment that "The content of your post is awesome" Great workJogo para criança online
play Games friv
free online friv Games
ReplyDelete
Replies

Add comment

Faculty of Language

Comments

Tuesday, April 9, 2019

Two cipher examples

19 comments:

Contributors