Friday, March 29, 2019

More on "arbitrary"

Bill Idsardi

Alex and Tobias have upped the ante, raised the stakes and doubled down in the substance debate, advocating a "radical substance-free" position in their post.

I had been pondering another post on this topic myself since reading Omer's comment on his blog "a parallel, bi-directional architecture is literally the weakest possible architectural assumption". So I guess Alex and Tobias are calling my bluff, and I need to show my cards (again).

So I agree that "substance abuse" is bad, and I agree that minimization of substantive relationships is a good research tactic, but "substance-free" is at best a misnomer, like this "100% chemical free hair dye" which shoppers assume isn't just an empty box. A theory lacking any substantive connection with the outside world would be a theory about nothing.

And there's more to the question of "substance" than just entities, there are also predicates and relations over those entities. If phonology is a mental model for speech then it must have a structure and an interpretation, and the degree of veridicality in the interpretation of the entities, predicates and relations is the degree to which the model is substantive. Some truths about the entities, predicates and relations in the outside world will be reflected in the model, that's its substance. The computation inside the model may be encapsulated, disconnected from events in the world, without an interesting feedback loop (allowing, say, for simulations and predictions about the world) but that's a separate concept.

As in the case discussed by Omer, a lot of the debate about "substance" seems to rest on architectural and interface assumptions (with the phonology-phonetics-motor-sensory interfaces often termed "transducers" with nods to sensory transducers, see Fain 2003 for an introduction). The position taken by substance-free advocates is that the mappings achieved by these interfaces/transducers (even stronger, all interfaces) are arbitrary, with the canonical example being a look-up table, as exhibited by the lexicon. For example, from Scheer 2018:

“Since lexical properties by definition do not follow from anything (at least synchronically  speaking), the relationship between the input and the output of this spell-out is arbitrary: there is no reason why, say, -ed, rather than -s, -et or -a realizes past tense in English.
    The arbitrariness of the categories that are related by the translational process is thus a necessary property of this process: it follows from the fact that vocabulary items on either side cannot be parsed or understood on the other side. By definition, the natural locus of arbitrariness is the lexicon: therefore spell-out goes through a lexical access.
    If grammar is modular in kind then all intermodular relationships must instantiate the same architectural properties. That is, what is true and undisputed for the upper interface of phonology (with morpho-syntax) must also characterize its lower interface (with phonetics): there must be a spell-out operation whose input (phonological categories) entertain an arbitrary relationship with its output (phonetic categories).” [italics in original, boldface added here]

Channeling Omer then, spell-out via lookup table is literally the weakest possible architectural assumption about transduction. A lookup table is the position of last resort, not the canonical example. Here's Gallistel and King (2009: xi) on this point:

“By contrast, a compact procedure is a composition of functions that is guaranteed to generate (rather than retrieve, as in table look-up) the symbol for the value of an n-argument function, for any arguments in the domain of the function. The distinction between a look-up table and a compact generative procedure is critical for students of the functional architecture of the brain.”

I think it may confuse some readers that Gallistel and King talk quite a bit about lookup tables, but they do say "many functions can be implemented with simple machines that are incomparably more efficient than machines with the architecture of a lookup table" (p. 53).
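Gallistel and King's contrast is easy to see in code. Here's a minimal sketch (my own toy example, not theirs) of a lookup table versus a compact procedure for squaring:

```python
# Lookup table: must enumerate its whole domain in advance; values are
# retrieved, never generated.
square_table = {n: n * n for n in range(10)}  # fails for anything outside 0..9

def square_procedure(n):
    """Compact procedure: generates the value for any integer argument."""
    return n * n

print(square_table[7])        # works only because 7 was pre-stored
print(square_procedure(123))  # generated on demand, not retrieved
```

The table grows with the domain; the procedure does not. That asymmetry is the point of their distinction.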

Jackendoff 1997:107f (who advocates a parallel, bi-directional architecture of the language faculty by the way) struggles to find analogs to the lexicon:

"One of the hallmarks of language, of course, is the celebrated "arbitrariness of the sign," the fact that a random sequence of phonemes can refer to almost anything. This implies, of course, that there could not be language without a lexicon, a list of the arbitrary matches between sound and meaning (with syntactic properties thrown in for good measure).
   If we look at the rest of the brain, we do not immediately find anything with these same general properties. Thus the lexicon seems like a major evolutionary innovation, coming as if out of nowhere."

Jackendoff then goes on to offer some possible examples of lexicon-like associations: vision with taste ("mashed potatoes and French vanilla ice cream don't look that different") and skilled motor movements like playing a violin or speaking ("again it's not arbitrary, but processing is speeded up by having preassembled units as shortcuts.") But his conclusion ("a collection of stored associations among fragments of disparate representations") is that overall "it is not an arbitrary mapping".

As I have said before, in my opinion a mapping has substance to the extent that it has partial veridicality. (Max in the comments to the original post prefers "motivated" to what I called "non-arbitrary", but see Burling, who draws a direct opposition between "motivated" and "arbitrary".)

So I have two points to re-emphasize about partial veridicality: it's partial and it displays some veridicality.

Partially, not completely, veridical

This is the easy part, and the linguists all get this one. (But it was a continuing source of difficulty for some neuro people in my grad cogsci course over the years.) The sensory systems of animals are limited in dynamic range and in many other ways. The whole concept of a “just noticeable difference” means that there are physical differences that are below the threshold of sensory detection. The fact that red is next to violet on the color wheel is also an example of a non-veridical aspect of color perception.

These are relatively easy because they are a bit like existence proofs. We just need to find some aspect of the system that breaks a relationship at a single point across the interface. Using T to represent transduction, we need to find a relation R such that R(x,y) holds but TR(Tx,Ty) does not hold everywhere or vice versa. In the color wheel example the "external" relation is wavelength distance, and the "internal" relation is perceptual hue similarity; violet is perceptually similar to red even though the wavelength of violet is maximally distant from red in the visible spectrum. (But otherwise wavelength distance is a good predictor of perceptual similarity.) And this same argument extends to intermodular relationships within the visual system, as in the mapping between the RGB hue representation in the retina and the R/G-Y/B opponent process representation in the lateral geniculate nucleus.
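The color-wheel case can be put in code. The sketch below (with illustrative numbers of my own, not measured values) checks whether the external relation survives transduction:

```python
# Toy illustration of the T / R test in the text. Wavelengths (nm) and hue
# angles (degrees) are illustrative stand-ins, not measured values.
wavelength = {"red": 700, "yellow": 580, "green": 530, "blue": 470, "violet": 400}
hue_angle  = {"red": 0,   "yellow": 60,  "green": 120, "blue": 240, "violet": 300}

def external_distance(x, y):
    """R: distance in the world (linear wavelength difference)."""
    return abs(wavelength[x] - wavelength[y])

def internal_distance(x, y):
    """TR: distance in the percept (angular distance on the hue circle)."""
    d = abs(hue_angle[x] - hue_angle[y]) % 360
    return min(d, 360 - d)

# Red and violet are maximally far apart in wavelength...
assert external_distance("red", "violet") == 300
# ...but perceptually close on the hue circle: the transduction from line to
# circle breaks the relation at this point, so the mapping is only partially
# veridical.
assert internal_distance("red", "violet") < internal_distance("red", "blue")
```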

Partially, not completely, arbitrary

I am never forget the day
I am given first original paper to write
It was on analytical algebraic topology
Of locally Euclidean metrization
Of infinitely differentiable Riemannian manifold
Боже мой!
This I know from nothing
Tom Lehrer, "Lobachevsky"

This is somewhat harder to think about because one has to imagine really crazy functions (i.e. arbitrary functions in the mathematical sense, full lookup table functions). To put my cards on the table, I don't believe sensory transducers are capable of computing arbitrary functions (the place to look for this would be the olfactory system). I think they are limited to quasimorphisms, capable of making some changes in topology (e.g. line to circle in color vision), but the functions are almost everywhere differentiable, offering a connection with manifold learning (Jansen & Niyogi 2006, 2013). I think Gallistel and King (2009: x) have pretty much the same view (though I think "homomorphism" is slightly too strong):

“Representations are functioning homomorphisms. They require structure-preserving mappings (homomorphisms) from states of the world (the represented system) to symbols in the brain (the representing system). These mappings preserve aspects of the formal structure of the world.” 

So here's another bumper sticker slogan: preserved structure is substance. 

It's homomorphic not isomorphic so the structure is not completely preserved (it's only partially veridical). But it doesn't throw out all the structure, which includes not just entities but also relationships among entities.

A small example of this sort can be found in Heffner et al 2019. Participants were asked to learn new categories, mappings between sounds and colors, with the sounds drawn from a fricative continuum between [x] and [ç] (1-10), and the associated colors drawn from the various conditions shown in the figure.

I don't think it should come as much of a surprise that "picket fence" and "odd one out" are pretty hard for people to learn. So the point here is that there is structure in the learning mechanism; mappings with fewer discontinuities are preferred.
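One way to make "fewer discontinuities" concrete is to count category boundaries along the continuum. The condition names below are my own shorthand for the conditions in the figure, not the paper's labels:

```python
def boundaries(mapping):
    """Number of points where adjacent continuum steps change category."""
    return sum(1 for a, b in zip(mapping, mapping[1:]) if a != b)

# Hypothetical sound-to-color mappings along a 10-step continuum.
two_category = "AAAAABBBBB"   # one clean boundary: easy to learn
odd_one_out  = "AAAABAAAAA"   # a lone exception: two boundaries
picket_fence = "ABABABABAB"   # alternating: nine boundaries, hard to learn

for name, m in [("two-category", two_category),
                ("odd-one-out", odd_one_out),
                ("picket fence", picket_fence)]:
    print(name, boundaries(m))
```

A learner biased toward mappings with few discontinuities will rank these conditions in exactly the observed order of difficulty.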

Here's a similar finding from gerbils (Ohl et al 2001, 2009):

Ohl et al 2009: "Animals trained on one or more training blocks never generalized to pure tones of any frequency (e.g. start or stop frequencies of the modulated tone, or frequencies traversed by the modulation or extrapolated from the modulation). This could be demonstrated by direct transfer experiments (Ohl et al 2001, supplementary material) or by measuring generalization gradients for modulation rate which never encompassed zero modulation rates (Ohl et al 2001)." [pure tones have a zero modulation rate -- WJI]

That is, the gerbils don't choose a picket fence interpretation either, although that would work here, based on the starting frequency of the tone. Instead, they find the function with the fewest discontinuities that characterizes the data, based on their genetic endowment of spectro-temporal receptive fields (STRFs) in their primary auditory cortex. They don't get to invent new STRFs, let alone create arbitrary ones. The genetic endowment provides the structure for the sensory transductions, and thus some functions are learnable while many are not. So the resulting functions are partially, but not completely arbitrary. And they have a limited number of discontinuities.

By the way, exemplar (instance-based) learning models have no trouble with picket fence arrangements, learning them as quickly as they learn the other types.
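Why exemplar models are indifferent to the shape of the mapping can be seen in a minimal nearest-neighbor sketch (a stand-in for instance-based learning generally, not any specific published model):

```python
def exemplar_classify(memory, stimulus):
    """Return the label of the stored exemplar nearest to the stimulus."""
    nearest = min(memory, key=lambda ex: abs(ex[0] - stimulus))
    return nearest[1]

# Picket-fence training data on a 1-10 continuum: odd steps are "A",
# even steps are "B". The model just memorizes the pairs, so the
# alternating structure costs it nothing.
picket_fence = [(i, "A" if i % 2 else "B") for i in range(1, 11)]

assert exemplar_classify(picket_fence, 3) == "A"
assert exemplar_classify(picket_fence, 4) == "B"
```

Because no function is being fit, there is no penalty for discontinuities, which is exactly why such models fail to predict the human and gerbil results above.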

OK, I think that's enough for now. I'll address my take on the relative priority of features and segments in another post.


Fain GL 2003. Sensory Transduction. Sinauer.

Gallistel CR, King AP 2009. Memory and the Computational Brain. Wiley-Blackwell.

Heffner CC, Idsardi WJ, Newman RS 2019. Constraints on learning disjunctive, unidimensional auditory and phonetic categories. Attention, Perception & Psychophysics.

Jackendoff R 1997. The Architecture of the Language Faculty. MIT Press.

Jansen A, Niyogi P 2006. Intrinsic Fourier analysis on the manifold of speech sounds. IEEE ICASSP.

Jansen A, Niyogi P 2013. Intrinsic Spectral Analysis. IEEE Transactions on Signal Processing, 61(7), 1698–1710.

Ohl FW, Scheich H, Freeman WJ 2001. Change in pattern of ongoing cortical activity with auditory category learning. Nature, 412(6848), 733–736.

Ohl FW, Scheich H 2009. The role of neuronal populations in auditory cortex for category learning. In Holscher C, Munk M (Eds.) Information Processing by Neuronal Populations. Cambridge University Press. 224-246.

Scheer T 2018. The workings of phonology and its interfaces in a modular perspective. In Annual conference of the Phonological Society of Japan.


  1. I'm strongly sympathetic to this post, and I also wonder if there's an evolutionary angle here too:

    Arbitrary representations make a lot of sense if you're dealing with a general purpose machine (e.g. a programmable computer). If you need your machine to represent everything from numbers to pictures to blog posts, then it follows that the basic machinery for representing things can't be dedicated to any one of those tasks.

    But as soon as you have a machine dedicated to one task (e.g. a cognitive module), dedicated representations suddenly make a lot more sense. It's not obvious what advantage a bona fide arbitrary representation buys you if it's only doing one task anyway, so it's not clear how or why they could actually evolve in the first place. This probably explains why (afaik) everything we've discovered about neural representations suggests they are vaguely homomorphic in the sense described by Idsardi.

    In regards to phonology, this suggests that phonological representations should either be homomorphic with speech sounds, or else be an awkward exaptation from some other kind of neural representation (e.g. spatial representations) - or, more likely, a bit of both.

    Of course this argument also entails that the lexicon is very weird from an evolutionary perspective... but then we already knew that.

    1. Thanks for your comment Joe. I certainly agree that there is an evolutionary angle here which needs to be explored further. In regards to your point that "this suggests that phonological representation should either be homomorphic with speech sounds" I think that idea is consistent with the approach in Jakobson, Fant & Halle 1952 who sought definitions for distinctive features for both articulation and perception. (Though Morris later thought that they were primarily articulatory.) I think that the relation between gestural scores and "point notation" in Articulatory Phonology (Browman & Goldstein 1989) would also be an example of a quasimorphism.

  2. Partial veridicality
    [in three pieces, sorry about the length, I know…]

    I get your point, Bill. The Gallistel & King quote is also very clear. The idea is that whenever real-world things are associated with a cognitive category, some of their real-world properties are carried into the cognitive system, which however allows some slack / independence. And also, no real-world thing can move into the cognitive system if the cognitive system does not provide for it beforehand (the Cartesian-dualistic position you discuss with a Fodor quote in another of your posts).
    On the latter point: what do you do with the documented evidence to the contrary, i.e. where the brain-mind does make sense of real-world items in the form of electric impulses that the brain/mind has never experienced, in perception (bionic eye) and production (bionic prostheses)?

    On the former point, a longer development
    I am not sure how the partial veridicality calculus you discuss for colour perception could be generalized to other areas of the cognitive system. It requires the measurement of two distances: 1) of two items on the real world side, 2) of the percept of these two items. Applied to the morpho-syntax - phonology interface, how do you calculate the distance between two morpho-syntactic objects, i.e. between an active and a passive structure? And how do you calculate the distance between two phonological objects, i.e. labial and continuant? Suppose you can come up with a calculus on both sides, how do you then calculate the match / mismatch of both distances? One is not the percept of the other. Trying to establish any veridicality for the mapping of morpho-syntactic and phonological items looks like mission impossible: no match is any more or less veridical than any other match.

    This may then be an argument showing that the morpho-syntax - phonology interface is the odd man out: unlike other cognitive areas, it does not allow for a veridical calculus. Fair enough. I believe that there is a reason for that, though, pertaining to the ontological status of the morpho-syntax - phonology interface: unlike the colour example and others, it does not involve any real-world items. The two vocabularies that need to be matched are purely cognitive. Hence they are incommensurable.

    But let us pursue the logic of the "odd man out": we expect that the morpho-syntax - phonology interface is a hapax, i.e. that other interfaces are different. Because going through a lexicon is strange (weak architectural assumption, no good parallels elsewhere in cognition). Let us then consider the phonology-phonetics interface, one which relates real-world items to cognitive categories (like colour, unlike morpho-syntax - phonology). Here we are able to calculate the distance between two real-world items such as labial vs. dental vs. dorsal. But there is trouble because the result may not be the same depending on whether you choose real-world type 1 or type 2. Type 1 is articulatory and says labial is closer to dental than to dorsal. Type 2 is acoustic and says labial is closer to dorsal than to dental. What do you choose? For other real-world items, no calculus appears to be possible: what's the distance between labial and continuant? Between spread glottis and dental, as opposed to spread glottis and labial? That's much unlike the clean and neat colour wavelength, which comes in numbers you can compare directly.
    There is more trouble when you look at the cognitive side of the coin: there is no perception of the real-world items at all. Ask a speaker what their perceived distance between labial and dental is… Speakers have no idea of all that and if there is a percept at all they have no conscious access to it.
    Hence even though just like in the colour example we are talking about the mapping of real-world items on related cognitive (perceptual) categories, we simply fail to carry out the distance calculus. That is, just like for the odd man out, there is no veridicality that can be calculated.

  3. This is why I said earlier that I am not sure how the colour-based veridicality calculus could be generalized to other areas of the cognitive system, i.e. how it could be a general property of intermodular interfaces.
    Of course we as trained linguists can make the veridicality calculus: the phonological feature [labial] is also "labial" on the phonetic side, hence this is a match, not a mismatch. But of course that is cheating: we have no idea what phonetic "labiality" looks like in the cognitive system once it is grammaticalized (= categorized). The only reason phonologists attribute the cognitive identity [labial] (phonological feature) to it is that this item is pronounced "labial" (phonetics, real world). That's all circular (because we don't have the evidence from the percept).

    But let us pursue the veridicality calculus at the phonology-phonetics interface, since it is common practice among phonologists/phoneticians. They don't compare the distance of two items on either side (phonology and phonetics) and then compare the (mis)match. Rather, they compare just one item on the phonological side and the way it is pronounced. As I said, the result is always veridical, with some slack that is tolerated and ascribed to the motor system or other extra-grammatical causes: [coronal] in a [t] may come out as pre-coronal, alveolo-coronal, post-coronal etc. (across speakers or within a given speaker). Always veridical… except in some cases, which are not very frequent but exist. Alex' Glossa paper is about such a case, which people call a phonology-phonetics mismatch (Silke Hamann 2014 has collected a number of those in an OCP talk): the sonorant /r/ (phonology) appears as [ʃ,ʒ] in Polish, [h] in Brazilian Portuguese or [ʁ,χ] in French and German, and still in a number of other guises elsewhere. People are sure all these phonetic items are pronunciations of /r/ (rather than of what they are on the surface) because they show the phonological behaviour of /r/: for instance they appear as the second member of a branching onset, a privilege of sonorants.
    So that looks like a very nice instantiation of partial veridicality: 98% of phonology-phonetics mappings are faithful (veridical), while 1% or 2% are mismatches (non-veridical).
    The point Alex makes in his paper is that these mismatches show positively that the cognitive system is perfectly happy with them. They open a window on what the cognitive system is able to do, just like languages that exhibit a rare pattern, but which you won't want to miss since it is the only window you have on what is really possible. Since we thus know for sure that mismatches are possible, there are no grounds for considering that non-mismatches (veridical mappings) are in any way compelling, in general or for the particular cases that are faithful. That is, veridical mappings may well be just an accident from the viewpoint of the cognitive system, which is just as happy with veridical as with non-veridical mappings.

  4. Of course you will need to explain why 98% of the mappings are accidentally veridical. You don't need to reach far to get an answer that is entirely independent of the cognitive association: phonologists have known about the life-cycle of phonological processes since the 19th century (Bermúdez-Otero 2015). Phonological processes are grammaticalizations of phonetic precursors, which upon innovation are all neat and regular and veridical and exceptionless and "natural" – because they are phonetics taken over into phonology. Once living in the cognitive system, they are divorced from their phonetic origin and its real-world constraints and may go off the track: they start being irregular, develop morphological conditions, have lexical exceptions etc. Bach & Harms (1972) put it this way: crazy rules (non-veridical mappings) are not born crazy, they become crazy through aging. But most rules / mappings don't live long enough to become crazy (or don't meet accidents that make them crazy during their life).

    Now Alex' argument I think may be applied to your colour example, Bill: the non-veridical mapping (violet is perceived as being close to red) shows that the cognitive system is happy to be non-veridical. The otherwise faithful veridical mapping is accidental and could as well be non-veridical (what's the cross-cultural record for colour closeness perception?). You would then need a good reason why violet-red goes off the track rather than, say, blue-yellow. Are there any reasons advanced why violet-red is the bad guy?

    In sum, then, if veridical mappings are accidental, mapping as such is arbitrary from the viewpoint of the cognitive system, which does not impose any restrictions on mapping. The existence of non-veridical mappings shows that veridical mappings, however frequent they are, exist for reasons independent of the cognitive system.

    Here is the pre-print of a paper of mine on the "radical" substance-free approach, entitled Phonetic arbitrariness and its consequences. Section 3.2 is on the fact that if their gut-feeling tells phonologists that some minor non-veridical mapping is not a big deal they call it slack (no problem), as opposed to a more pressing gut feeling for more serious non-veridical mapping, which is then called a mismatch (problem). But with no rationale for drawing a red line between what is just slack and what a serious mismatch is. [There is also a note on Volenec & Reiss 2018 written before the discussion on the Blog had begun].

    Bach, Emmon & R. T. Harms. 1972. How do languages get crazy rules? In Robert Stockwell & Ronald Macaulay (eds.), Linguistic change and generative theory, 1-21. Bloomington: Indiana University Press.
    Bermúdez-Otero, Ricardo. 2015. Amphichronic explanation and the life cycle of phonological processes. In Patrick Honeybone & Joseph C. Salmons (eds.), The Oxford handbook of historical phonology, 374-399. Oxford: OUP.
    Hamann, Silke. 2014. Phonetics-phonology mismatches. Paper presented at Old World Conference in Phonology, Leiden, 22-25 January.
    Scheer, Tobias. in press. Phonetic arbitrariness and its consequences. Phonological Studies.
