In The Atlantic today, a summary of a new article in Science Advances this week about speech evolution:
https://www.theatlantic.com/science/archive/2019/12/when-did-ancient-humans-start-speak/603484/?utm_source=feed
https://advances.sciencemag.org/content/5/12/eaaw3916
I think Greg Hickok had the most trenchant comment, that people are hoping “that there was one thing that had to happen and that released the linguistic abilities.” And John Locke had the best bumper sticker, “Motor control rots when you die.”
As the authors say in the article, recent work has shown that primate vocal tracts are capable of producing some vowel sounds:
https://advances.sciencemag.org/content/2/12/e1600723
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0169321
This is certainly interesting from a comparative physiological perspective, and the article has a great summary of tube models for vowels. But I don't think that "producing vowel sounds" should be equated with "having speech" in the sense of "having a phonological system". My own feeling is that we should be looking for a couple of things. First, the ability to pair non-trivial sound sequences (phonological representations) with meanings in long-term memory. Some nonhuman animals (including dogs) do have this ability, or something like it, so this isn't the linchpin.
http://science.sciencemag.org/content/304/5677/1682.short
Second, the emergence of speech sound sequencing abilities in both the motor and perceptual systems. That is, the ability to perform computations over sequences; to compose, decompose and manipulate sequences of speech sounds, which includes concatenation, reduplication, phonotactic patterning, phonological processes and so on. The findings closest to showing this for nonhuman animals (birds in this case) that I am aware of are in:
https://www.nature.com/articles/ncomms10986
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2006532
In those papers the debate is framed in terms of syntax, which I think is misguided. But the experiments do show some sound sequencing abilities in the birds which might coincide with some aspects of human phonological abilities. Of course, this would be an example of convergent evolution, so it tells us almost nothing about the evolutionary history in primates.
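Just to be concrete about what I mean by computations over sequences, here is a toy Python sketch (my own illustration, not a model of anything in the papers above): phonological forms as sequences of symbols, with a few example operations over them.

```python
# Toy illustration of "computations over sequences of speech sounds":
# phonological forms as tuples of segment symbols, with concatenation,
# reduplication, and a crude phonotactic check as example operations.

def concatenate(a, b):
    """Concatenate two sound sequences, e.g. a stem and a suffix."""
    return a + b

def reduplicate(form):
    """Total reduplication: copy the whole form."""
    return form + form

def obeys_cv_phonotactics(form, vowels=("a", "i", "u")):
    """A crude phonotactic constraint: no two adjacent consonants."""
    return all(x in vowels or y in vowels for x, y in zip(form, form[1:]))

stem, suffix = ("b", "a", "k"), ("i",)
word = concatenate(stem, suffix)               # ('b', 'a', 'k', 'i')
print(reduplicate(word))                       # ('b', 'a', 'k', 'i', 'b', 'a', 'k', 'i')
print(obeys_cv_phonotactics(word))             # True
print(obeys_cv_phonotactics(("b", "k", "a")))  # False: the 'bk' cluster is banned here
```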
Thursday, December 12, 2019
Monday, August 19, 2019
Herb Terrace on Nim and Noam
Monday, July 8, 2019
Interesting Post Doc possibility for linguists with cogneuro interests
William Matchin sent me this post doc opportunity for posting on FoL
A Postdoctoral Fellow position is available at the University of South Carolina, under the direction of Prof. William Matchin in the NeuroSyntax laboratory. The post-doc will help develop new projects and lead the acquisition, processing, and analysis of behavioral and neuroimaging data. They will also assist with the organization of the laboratory and coordination of laboratory members. We are particularly interested in candidates with a background in linguistics who are interested in projects at the intersection of linguistics and neuroscience. For more information about our research program, please visit www.williammatchin.com.
Salary and benefits are commensurate with experience. The position is for one year, renewable for a second year, and potentially further pending the acquisition of grant funding.
The postdoctoral associate will work in close association with the Aphasia Lab (headed by Dr. Julius Fridriksson) as part of the NIH-funded Center for the Study of Aphasia Recovery (C-STAR). The NeuroSyntax lab is also part of the Linguistics program, the Neuroscience Community, and the Center for Mind and Brain at UofSC.
The University of South Carolina is in historic downtown Columbia, the capital of South Carolina. Columbia is centrally located within the state, within a two-hour drive of both the beach (including historic Charleston, SC) and the mountains (including beautiful Asheville, NC).
If you are interested in this position, please send an email to Prof. William Matchin at matchin@mailbox.sc.edu with your CV and a brief introduction to yourself, your academic background, and your research interests. You can find more details and apply online: https://uscjobs.sc.edu/postings/60022.
Wednesday, July 3, 2019
Postdoc position at South Carolina
A Postdoctoral Fellow position is available at the University of South Carolina, under the direction of Prof. William Matchin in the NeuroSyntax laboratory. The post-doc will help develop new projects and lead the acquisition, processing, and analysis of behavioral and neuroimaging data. They will also assist with the organization of the laboratory and coordination of laboratory members. We are particularly interested in candidates with a background in linguistics who are interested in projects at the intersection of linguistics and neuroscience. For more information about our research program, please visit www.williammatchin.com.
Salary and benefits are commensurate with experience. The position is for one year, renewable for a second year, and potentially further pending the acquisition of grant funding.
The postdoctoral associate will work in close association with the Aphasia Lab (headed by Dr. Julius Fridriksson) as part of the NIH-funded Center for the Study of Aphasia Recovery (C-STAR). The NeuroSyntax lab is also part of the Linguistics program, the Neuroscience Community, and the Center for Mind and Brain at UofSC.
The University of South Carolina is in historic downtown Columbia, the capital of South Carolina. Columbia is centrally located within the state, within a two-hour drive of both the beach (including historic Charleston, SC) and the mountains (including beautiful Asheville, NC).
If you are interested in this position, please send an email to Prof. William Matchin matchin@mailbox.sc.edu with your CV and a brief introduction to yourself, your academic background, and your research interests. You can find more details and apply online: https://uscjobs.sc.edu/postings/60022.
Monday, June 17, 2019
The speed of evolution in domestication
In PNAS today, an article on "Evolution of facial muscle anatomy in dogs", which argues for an adaptation of canine facial anatomy in the context of domestication.
From the abstract: "Domestication shaped wolves into dogs and transformed both their behavior and their anatomy. Here we show that, in only 33,000 y, domestication transformed the facial muscle anatomy of dogs specifically for facial communication with humans."
Tuesday, June 4, 2019
Bees learn symbols for the numbers 2 and 3
In PTRSB today, bees learned to associate "N" with groups of 2, and "⊥" with groups of 3.
From the discussion: "Our findings show that independent groups of honeybees can learn and apply either a sign-to-numerosity matching task or a numerosity-to-sign matching task and subsequently apply acquired skills to novel stimuli. Interestingly, despite bees demonstrating a direct numerosity and sign association, they were unable to transfer the acquired skill to solve a reverse matching task."
So, remembering watching Romper Room as a child, these bees are clearly Do Bees.
Friday, May 17, 2019
Why do geese honk?
A question to CBC Radio's Quirks and Quarks program, "Why do Canada geese honk while migrating?" The answer the CBC gives is "They honk to communicate their position in the flock". But Elan Dresher gave a different answer back in 1996.
New blog: Outdex
Thursday, May 9, 2019
GG + NN = Thing 1 + Thing 2?
Language, stealing (er, adopting) the BBS model, has a target article by Joe Pater and several replies.
Here's my attempt at bumper-sticker summaries for the articles. You can add your own in the comments. (No, there are no prizes for this.)
Pater: GG + NN + ?? = Profit!
Berent & Marcus: Structure + Composition = Algebra
Dunbar: Marr + Marr = Marr
Linzen: NNs learn GGs sometimes, sorta
Pearl: ?? = interpretability
Potts: Functions + Logic = Vectors + DL
Rawski & Heinz: No Free Lunch, but there is a GI tract
Pater starts out with the observation that Syntactic Structures and "The perceptron: A perceiving and recognizing automaton" were both published in 1957.
Here is a list of other things that were published in 1957 (hint: 116). It may say too much about me, but some of my favorites over the years from this list have included: The Cat in the Hat, From Russia with Love, The Way of Zen, Endgame and Parkinson's Law. But I'm afraid I can't really synthesize all that into an enlightened spy cat whose work expands to fill the nothingness. You can add your own mash-ups in the comments. (No, there are no prizes for this either.)
Sunday, April 28, 2019
Scheering forces
I'll respond to Tobias in a post rather than a blog reply because he raises several points, and I want to include a picture or two.
1. TS: "When you cross the real-world boundary, i.e. when real-world items (such as wave lengths) are mapped onto cognitive categories (colors perceived), you are talking about something else since the real world is not a module."
The arguments I was making hold equally well for representations within the central nervous system (CNS), for example between the retina, the lateral geniculate nucleus and V1. Real-world spatial relations are mapped partially veridically onto the retina (due to the laws of optics). The spatial organization of the retina is (partially) maintained in the mapping to cortex; that is, LGN and V1 are retinotopic. So the modules here are the retina, LGN and V1, which are certainly modules within the CNS.
The same sort of relationship is true for acoustic frequency, the cochlea, the medial geniculate nucleus (MGN), and A1. Acoustic frequencies are mapped partially veridically onto the coiled line of hair cells in the cochlea (due to laws of acoustics). That is, frequency is mapped into a spatial (place) code at the cochlea (this is not the only mechanism for low frequencies). And the cochlear organization is partially preserved in the mappings to MGN and A1; they are cochleotopic (= tonotopic). There is an "arbitrary" aspect here: frequency is represented with a spatial code. But the spatial code is not completely arbitrary or random; it is organized and ordinal, such that frequency increases monotonically from the apex to the base of the cochlea, as shown in the diagram from Wikipedia, and this ordering is preserved in the tonotopic gradients in A1. That is, the mappings between the modules are quasimorphisms.
2. TS: "when I use the word "arbitrary" I only mean the above: the fact that any item of list A may be associated with any item of list B."
Then I think you should find a different term. I also think there has been far too much focus on the items. As I have tried to explain, items enter into relationships with other items, and we need to consider the preservation of these relationships across the interface or the lack thereof; we need to keep track of the quasimorphisms. So it is not the case for many of the intermodular interfaces in sensation and perception that any item on one side of the interface can be mapped to any item on the other side. Spatial, temporal and other ordering relationships tend to be preserved across the interfaces, and this strongly constrains the mapping of individual items. Remarkably, this is true even in synesthesia; see Plate 9 from Cytowic 2018.
3. TS: "That's all fine, but I am talking about the mind, not about the brain. Whatever the wirings in the brain, they won't tell us anything about how cognitive items of two distinct vocabularies are related (Vocabulary Insertion), or how a real-world item is associated to a cognitive category (wave length - color)."
I am not a dualist, and I doubt that this blog is a good forum for a discussion of the merits of mind/body dualism. Here is a quote from Chomsky 1983 on the mind/brain; he reiterates this point in Chomsky 2005: 257 and in many other places.
"Now, I think that there is every reason to suppose that the same kind of “modular” approach is appropriate for the study of the mind — which I understand to be the study, at an appropriate level of abstraction, of properties of the brain ..."
Just to be clear, I am not saying that cognitive scientists should defer to neuroscientists, but they should talk to them. The idea that we have learned nothing about color perception and cognition from the study of the human visual pathway is simply false.
4. TS: "is there evidence for interfaces that are not list-based?"
Yes, almost any (non-linguistic) set of items with an ordering relation. When aspects of the ordering relation are preserved across the interface, the mapping will be a quasimorphism, and the item-to-item mappings will be strongly constrained by this: if a < b then f(a) <' f(b), where <' is the ordering on the other side of the interface. What's unusual about the lexicon is that small changes in pronunciation can lead to enormous changes in meaning. In many of the other cases we instead end up with a very small, almost trivial look-up table, something like the sets of basis vectors for the two spaces, as with homomorphisms between groups in algebra.
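Here is a minimal sketch of the point, with made-up numbers: once an ordering relation has to be preserved, most item-to-item pairings are ruled out, and checking for preservation is mechanical.

```python
# Toy check of order preservation (a minimal stand-in for the quasimorphisms
# discussed above); the numbers are made up for illustration.

def preserves_order(pairs):
    """pairs: (source, target) items; True iff a < b always entails f(a) < f(b)."""
    return all(fa < fb for a, fa in pairs for b, fb in pairs if a < b)

# A cochlea-like mapping: frequency (Hz) -> distance from the apex (arbitrary units).
tonotopic = [(125, 0.10), (500, 0.35), (2000, 0.65), (8000, 0.90)]
# An arbitrary pairing of the same items, as in a pure look-up table.
scrambled = [(125, 0.65), (500, 0.10), (2000, 0.90), (8000, 0.35)]

print(preserves_order(tonotopic))  # True: the ordering constrains the item mapping
print(preserves_order(scrambled))  # False: only a list relates these items
```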
5. TS: "is there evidence for associations that correspond to partial veridicality, i.e. where the to-be-related items are commensurable, i.e. allow for the assessment of similarity?" ...
"The same goes for the association of real-world items with cognitive categories: trying to assess the (dis)similarity of "450-485 nm" and "blue", as opposed to, say, "450-485 nm" and "red" (or any other perceived color for that matter) is pointless. Wave lengths and perceived colors are incommensurable and you won't be able to tell whether the match is veridical, non-veridical or partially veridical."
This isn't pointless at all. In fact, remarkable progress has been made in this area. See, for example, Hardin 1988, Hardin & Maffi 1997, Palmer 1999 and Bird et al 2014. The match is partially veridical in a variety of ways. Small changes in spectral composition generally lead to small changes in perceived hue; the mapping is a quasimorphism. Importantly, the topology of the representation changes (and thus this is a non-veridical aspect of the mapping): from a linear relation over wavelength, to a circular one in the cone cells of the retina, to an opponent-process representation in the LGN.
6. TS: "The secret key is the look-up table that matches items of the two modules."
I agree with this, except that I want the look-up table to be as small as possible, the "basis vectors" for the spaces. In my opinion, the best way to accomplish this is with innate initial look-up tables for the features, giving the learner the initial conditions for the Memory-Action and Perception-Memory mappings. The feature-learning approaches, including Mielke 2008, Dresher 2014 and Odden 2019, start with an ability to perceive IPA-like phonetic representations. I simply don't believe that this is a plausible idea, given how difficult even simple cases are for such an approach, as explained in Dillon, Dunbar & Idsardi 2013.
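A toy illustration of the "small look-up table" idea (the feature-to-cue pairings here are simplified stand-ins, not a proposal): only the features are listed, and the mappings for whole segments follow compositionally rather than being listed item by item.

```python
# Simplified stand-in for a minimal feature-level look-up table ("basis
# vectors"): segment-level mappings are derived, not listed.

feature_to_cue = {
    "round": "lowered F2",
    "front": "raised F2",
    "high":  "lowered F1",
}

def cues_for(segment_features):
    """Compose the cue description for a segment from its features."""
    return sorted(feature_to_cue[f] for f in segment_features)

print(cues_for({"high", "front"}))  # an /i/-like segment
print(cues_for({"high", "round"}))  # an /u/-like segment
```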
References:
Bird CM, Berens SC, Horner AJ & Franklin A. 2014. Categorical encoding of color in the brain. Proceedings of the National Academy of Sciences, 111(12), 4590–4595.
Chomsky N. 1983. The Psychology of Language and Thought: Noam Chomsky interviewed by Robert W. Rieber. In RW Rieber (ed) Dialogues on the Psychology of Language and Thought. Plenum.
Chomsky N. 2005. Reply to Lycan. In LM Antony & N Hornstein (eds) Chomsky and his Critics. Blackwell.
Cytowic RE. 2018. Synesthesia. MIT Press.
Dillon B, Dunbar E & Idsardi WJ. 2013. A single-stage approach to learning phonological categories: insights from Inuktitut. Cognitive Science, 37(2), 344–377.
Hardin CL. 1988. Color for Philosophers: Unweaving the Rainbow. Hackett.
Hardin CL & Maffi L. 1997. Color Categories in Thought and Language. Cambridge University Press.
Palmer SE. 1999. Vision Science: Photons to Phenomenology. MIT Press.
1. TS: "When you cross the real-world boundary, i.e. when real-world items (such as wave lengths) are mapped onto cognitive categories (colors perceived), you are talking about something else since the real world is not a module."
The arguments I was making hold equally well for representations within the central nervous system (CNS), for example between the retina, the lateral geniculate nucleus and V1. Real-world spatial relations are mapped partially veridically onto the retina (due to the laws of optics). The spatial organization of the retina is (partially) maintained in the mapping to cortex; that is, LGN and V1 are retinotopic. So the modules here are the retina, LGN and V1, which are certainly modules within the CNS.
The same sort of relationship is true for acoustic frequency, the cochlea, the medial geniculate nucleus (MGN), and A1. Acoustic frequencies are mapped partially veridically onto the coiled line of hair cells in the cochlea (due to laws of acoustics). That is, frequency is mapped into a spatial (place) code at the cochlea (this is not the only mechanism for low frequencies). And the cochlear organization is partially preserved in the mappings to MGN and A1, they are cochleotopic (= tonotopic). There is an "arbitrary" aspect here: frequency is represented with a spatial code. But the spatial code is not completely arbitrary or random, but organized and ordinal, such that frequency increases monotonically from the apex to the base in the cochlea, as shown in the diagram from Wikipedia, and is preserved in tonotopic gradients in A1. That is, the mappings between the modules are quasimorphisms.
2. TS: "when I use the word "arbitrary" I only mean the above: the fact that any item of list A may be associated with any item of list B."
Then I think you should find a different term. I also think there has been far too much focus on the items. As I have tried to explain, items enter into relationships with other items, and we need to consider the preservation of these relationships across the interface or the lack thereof; we need to keep track of the quasimorphisms. So it is not the case for many of the intermodular interfaces in sensation and perception that any item on one side of the interface can be mapped to any item on the other side of the interface. Spatial and temporal and other ordering relationships tend to be preserved across the interfaces, and this strongly constrains the mapping of individual items. Remarkably, this is true even in synesthesia, see Plate 9 from Cytowic 2018.
3. TS: "That's all fine, but I am talking about the mind, not about the brain. Whatever the wirings in the brain, they won't tell us anything about how cognitive items of two distinct vocabularies are related (Vocabulary Insertion), or how a real-world item is associated to a cognitive category (wave length - color)."
I am not a dualist, and I doubt that this blog is a good forum for a discussion of the merits of mind/body dualism. Here is a quote from Chomsky 1983 on the mind/brain, he reiterates this in Chomsky 2005:257 and in many other places.
"Now, I think that there is every reason to suppose that the same kind of “modular” approach is appropriate for the study of the mind — which I understand to be the study, at an appropriate level of abstraction, of properties of the brain ..."
Just to be clear, I am not saying that cognitive scientists should defer to neuroscientists, but they should talk to them. The idea that we have learned nothing about color perception and cognition from the study of the human visual pathway is simply false.
4. TS: "is there evidence for interfaces that are not list-based?"
Yes, almost any (non-linguistic) set of items with an ordering relation. When aspects of the ordering relation are preserved across the interface the mapping will be a quasimorphism, and thus the item-to-item mappings will be strongly constrained by this, that is, if a < b then f(a) <f f(b). What's unusual about the lexicon is that small changes in pronunciation can lead to enormous changes in meaning. In many of the other cases we instead end up with a very small, almost trivial look-up table, something like the sets of basis vectors for the two spaces, as with homomorphisms between groups in algebra.
5. TS: "is there evidence for associations that correspond to partial veridicality, i.e. where the to-be-related items are commensurable, i.e. allow for the assessment of similarity?" ...
"The same goes for the association of real-world items with cognitive categories: trying to assess the (dis)similarity of "450-485 nm" and "blue", as opposed to, say, "450-485 nm" and "red" (or any other perceived color for that matter) is pointless. Wave lengths and perceived colors are incommensurable and you won't be able to tell whether the match is veridical, non-veridical or partially veridical."
This isn't pointless at all. In fact, remarkable progress has been made in this area. See, for example, Hardin 1988, Hardin & Maffi 1997, Palmer 1999 and Bird et al 2014. The match is partially veridical in a variety of ways. Small changes in spectral composition generally lead to small changes in perceived hue; the mapping is a quasimorphism. Importantly, the topology of the representation changes -- and thus is a non-veridical aspect of the mapping, from a linear relation to a circular one in the cone cells of the retina to an opponent process representation in LGN.
6. TS: "The secret key is the look-up table that matches items of the two modules."
I agree with this, except that I want the look-up table to be as small as possible, the "basis vectors" for the spaces. In my opinion, the best way to accomplish this is with innate initial look-up tables for the features, giving the learner the initial conditions for the Memory-Action and Perception-Memory mappings. The feature-learning approaches, including Mielke 2008, Dresher 2014 and Odden 2019, start with an ability to perceive IPA-like phonetic representations. I simply don't believe that this is a plausible idea, given how difficult even simple cases are for such an approach, as explained in Dillon, Dunbar & Idsardi 2013.
References:
Bird CM, Berens SC, Horner AJ & Franklin A. 2014. Categorical encoding of color in the brain. Proceedings of the National Academy of Sciences, 111(12), 4590–4595.
Chomsky N. 1983. The Psychology of Language and Thought: Noam Chomsky interviewed by Robert W. Rieber. In RW Rieber (ed) Dialogues on the Psychology of Language and Thought. Plenum.
Chomsky N. 2005. Reply to Lycan. In LM Antony & N Hornstein (eds) Chomsky and his Critics. Blackwell.
Cytowic RE. 2018. Synesthesia. MIT Press.
Dillon B, Dunbar E & Idsardi WJ. 2013. A single-stage approach to learning phonological categories: insights from Inuktitut. Cognitive Science, 37(2), 344–377.
Hardin CL. 1988. Color for Philosophers: Unweaving the Rainbow. Hackett.
Hardin CL & Maffi L. 1997. Color Categories in Thought and Language. Cambridge University Press.
Palmer SE. 1999. Vision Science: Photons to Phenomenology. MIT Press.
Wednesday, April 24, 2019
ECoG to speech synthesis
In Nature today, another fascinating article from Eddie Chang's lab at UCSF. They were able to synthesize intelligible speech from ECoG recordings of cortical activity in sensory-motor and auditory areas. The system was even able to decode and synthesize speech successfully from silently mimed speech. The picture (Figure 1 in the article) shows a block diagram of the system.
There are also two commentaries on the work, along with some speech samples from the system.
Friday, April 19, 2019
A possible EFP developmental trajectory from syllables to segments
Infants can show a puzzling range of abilities and deficits in comparison with adults, outperforming them on many phonetic perception tasks while lagging behind in other ways. Some null results obtained with one procedure can be overturned with more sensitive procedures, and some contrasts are "better" than others in terms of effect size and various acoustic or auditory measures of similarity (Sundara et al 2018). And there are other oddities about the infant speech perception literature, including the fact that the syllabic stimuli generally need to be much longer than the average syllable durations in adult speech (often twice as long). One persistent idea is that infants start with a syllable-oriented perspective and later move to a more segment-oriented one (Bertoncini & Mehler 1981), and that in some languages adults still have a primary orientation toward syllables, at least for some speech production tasks (O'Seaghdha et al 2010; but see various replies, e.g. Qu et al 2012).
More than a decade ago, I worked with Rebecca Baier and Jeff Lidz to try to investigate audio-visual (AV) integration in 2-month-old infants (Baier et al 2007). Infants were presented with one audio track along with two synchronized silent movies of the same person (namely Rebecca) presented on a large TV screen. The movies were of different syllables being produced; the audio track generally matched one of the movies. Using this method we were able to replicate the results of Kuhl & Meltzoff 1982 that two-month-old infants are able to match faces and voices among /a/, /i/, and /u/. Taking this one step further, we were also able to show that infants could detect dynamic syllables, matching faces with, for example, /wi/ vs. /i/ audio. We did some more poking around with this method, but got several results that were difficult to understand. One of them was a failure of the infants to match on /wi/ vs. /ju/. (And we are pretty sure that "we" and "you" are fairly frequently heard words for English-learning infants.) Furthermore, when they were presented with /ju/ audio alongside /wi/ and /i/ faces, they matched the /ju/ audio with the /wi/ video. This behavior is at least consistent with a syllable-oriented point of view: they hear a dynamic syllable with something [round] and something [front] in it, but they cannot tell the relative order of [front] and [round]. This also seems consistent with the relatively poor abilities of infants to detect differences in serial order (Lewkowicz 2004). Rebecca left to pursue electrical engineering and this project fell by the wayside.
This is not to say that infants cannot hear a difference between /wi/ and /ju/, though; I expect that dishabituation experiments would succeed on this contrast. The infants would also not match faces for /si/ vs. /ʃi/, but the dishabituation experiment worked fine on that contrast (as expected). So there are certainly also task differences between the two experimental paradigms.
But I think that now we may have a way to understand these results more formally, using the Events, Features and Precedence model discussed on the blog a year ago, and developed more extensively in Papillon 2018. In that framework, we can capture a /wi~ju/ syllable schematically as (other details omitted):
The relative ordering of [front] and [round] is underspecified here, as is the temporal extent of the events. The discrimination between /wi/ and /ju/ amounts to incorporating the relative ordering of [front] and [round], that is, which of the dashed lines is needed in:
When [round] precedes [front], that is the developing representation for /wi/; when [front] precedes [round], that is the developing representation for /ju/. Acquiring this kind of serial order knowledge between different features might not be that easy, as it is possible that [front] and [round] are initially segregated into different streams (Bregman 1990), and order perception across streams is worse than order perception within streams. It's conceivable that the learner would be driven to look for additional temporal relations when the temporally underspecified representations incorrectly predict such "homophony", akin to a hash collision.
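To make the schematic concrete, here is a toy rendering of an EFP-style representation (my own sketch, not Papillon's formalism): events carry features, precedence is a set of ordered pairs, and adding a single precedence pair is what separates /wi/ from /ju/.

```python
# Toy EFP-style representation: events carry features, and precedence is a set
# of ordered pairs of events. My own sketch, not Papillon's (2018) formalism.

underspecified = {
    "events": {"e1": {"round"}, "e2": {"front"}},
    "precedence": set(),   # order of [round] and [front] not yet represented
}

def add_precedence(rep, earlier, later):
    """Return a copy of rep with one additional precedence relation."""
    return {"events": dict(rep["events"]),
            "precedence": set(rep["precedence"]) | {(earlier, later)}}

wi = add_precedence(underspecified, "e1", "e2")  # [round] precedes [front]
ju = add_precedence(underspecified, "e2", "e1")  # [front] precedes [round]

# The underspecified graph is consistent with both orders; the learner's task
# is to discover which precedence pair to add.
print(wi["precedence"], ju["precedence"])
```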
If we pursue this idea more generally, the EFP graphs will gradually become more segment-oriented as additional precedence relations are added, as in:
And if we then allow parallel, densely-connected events to fuse into single composite events, we can get even closer to a segment-oriented representation:
So the general proposal would be that the developing representation of the relative order of features is initially rather poor and is underspecified for order between features in different "streams". Testing this is going to be a bit tricky, though. An even more general conclusion would be that features are not learned from phonetic segments (Mielke 2008), but that features are gradually combined during development to form segment-sized units. We could also add other features encoding extra phonetic detail to these developing representations; it could then be the case that different phonetic features have different temporal acuity for learners, and so cohere with other features to different extents.
References
Baier, R., Idsardi, W. J., & Lidz, J. (2007). Two-month-olds are sensitive to lip rounding in dynamic and static speech events. In Proceedings of the International Conference on Auditory-Visual Speech Processing.
Bertoncini, J., & Mehler, J. (1981). Syllables as units in infant speech perception. Infant behavior and development, 4, 247-260.
Bregman, A. (1990). Auditory Scene Analysis. MIT Press.
Kuhl, P.K., and Meltzoff, A.N. (1982). The Bimodal perception of speech in infancy. Science, 218, 1138-1141.
Lewkowicz, D. J. (2004). Perception of serial order in infants. Developmental Science, 7(2), 175–184.
Mielke, J. (2008). The Emergence of Distinctive Features. Oxford University Press.
O’Seaghdha, P. G., Chen, J. Y., & Chen, T. M. (2010). Proximate units in word production: Phonological encoding begins with syllables in Mandarin Chinese but with segments in English. Cognition, 115(2), 282-302.
Papillon, M. (2018). Precedence Graphs for Phonology: Analysis of Vowel Harmony and Word Tones. ms.
Qu, Q., Damian, M. F., & Kazanina, N. (2012). Sound-sized segments are significant for Mandarin speakers. Proceedings of the National Academy of Sciences, 109(35), 14265-14270.
Sundara, M., Ngon, C., Skoruppa, K., Feldman, N. H., Onario, G. M., Morgan, J. L., & Peperkamp, S. (2018). Young infants’ discrimination of subtle phonetic contrasts. Cognition, 178, 57–66.
Tuesday, April 9, 2019
Two cipher examples
I'm not convinced that I'm getting my point about "arbitrary" across, so maybe some toy examples will help. Let's encipher "Colorless green ideas" with a couple of ciphers.
1. Rot13: "Colorless green ideas" ⇒ "Pbybeyrff terra vqrnf". This method is familiar to old Usenet denizens. It makes use of the fact that the Latin alphabet has 26 letters by rotating them 13 places (a⇒n, b⇒o, ... m⇒z, n⇒a, o⇒b, ...) and so this method is its own inverse. That is, you decode a rot13 message by doing rot13 on it a second time. This is a special case of a Caesar cipher. Such ciphers are not very arbitrary as they mostly preserve alphabetic letter order, but they "wrap" the alphabet around into a circle (like color in the visual system) with "z" being followed by "a". In a rotation cipher, once you figure out one of the letter codes, you've got them all. So if "s" maps to "e" then "t" maps to "f" and so on.
2. Scrambled alphabet cipher: Randomly permute the 26 letters to other letters, for example A..Z ⇒ PAYRQUVKMZBCLOFSITXJNEHDGW. This is a letter-based codebook. This is arbitrary, at least from the individual letter perspective, as it won't preserve alphabetic order, encoding "Colorless" as "Yfcftcqxx". So knowing one letter mapping (c⇒y) won't let you automatically determine the others.
But this cipher does preserve various other properties, such as capitalization, number of distinct atomic symbols, spaces between words, message length, doubled letters, and sequential order in general.
Even word-based code books tend to preserve sequential order. That is, the message is input word-by-word from the beginning of the message to the end. But more sophisticated methods are possible, for example padding the message with irrelevant words. It's less common to see the letters of the individual words scrambled, but we could do that for words of varying lengths, say by having words of length 2 reversed (the permutation 21), so that "to" would be encoded as "to" ⇒ "ot" ⇒ "fj". Words of length three might be scrambled as 312, length four as 2431, and so on, choosing a random permutation for each word length. Adding this encryption technique will break apart some doubled letters, but the word order would still be preserved across the encryption.
These toy examples are just to show that "arbitrary" vs "systematic" isn't an all-or-nothing thing in a mapping. You have to consider all sorts of properties of the input and output representations and see which properties are being systematically preserved (or approximately preserved) across the mapping, and which are not. Temporal relations (like sequencing) are particularly important in this respect.
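Here is a minimal Python sketch of the two toy ciphers, just to make the property-preservation point concrete (lowercase only for brevity; the scrambled key is the one given above):

```python
import string

# Toy versions of the two ciphers discussed above (lowercase only).

def rot13(text):
    """Caesar cipher with shift 13; applying it twice returns the original."""
    shifted = string.ascii_lowercase[13:] + string.ascii_lowercase[:13]
    return text.translate(str.maketrans(string.ascii_lowercase, shifted))

def scramble(text, key="payrquvkmzbclofsitxjnehdgw"):
    """Monoalphabetic substitution using the scrambled alphabet from above."""
    return text.translate(str.maketrans(string.ascii_lowercase, key))

msg = "colorless green ideas"
print(rot13(msg))          # pbybeyrff terra vqrnf
print(rot13(rot13(msg)))   # colorless green ideas (rot13 is its own inverse)
print(scramble(msg))       # yfcftcqxx vtqqo mrqpx

# Both ciphers preserve spaces, word lengths, doubled letters and sequential
# order; what differs is how arbitrary the letter-to-letter table is.
```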
Saturday, April 6, 2019
ℝeℤolution
Chabot 2019:
"The notion that phonetic realizations of phonological objects function in an arbitrary fashion is counterintuitive at best, confounding at worst. However, order is restored to both phonology and phonetics if a modular theory of mind (Fodor 1983) is considered. In a modular framework, cognition is viewed as work carried out by a series of modules, each of which uses its own vocabulary and transmits inputs and outputs to other modules via interfaces known as transducers (Pylyshyn 1984; Reiss 2007), and the relationship between phonetics and phonology must be arbitrary. This formalizes the intuition that phonology deals in the discrete while phonetics deals in the continuous. A phonological object is an abstract cognitive unit composed of features or elements, with a phonetic realization that is a physical manifestation of that object located in time and space, which is composed of articulatory and perceptual cues." [italics in original, boldface added here]
The implication seems to be that any relation between discrete and continuous systems is "arbitrary". However, there are non-arbitrary mappings between discrete and continuous systems. The best known is almost certainly the relationship between the integers (ℤ) and the reals (ℝ). There is exactly one homomorphism from ℤ into ℝ that preserves addition, multiplication and identity, and it is the obvious one. Call it H. That is, H maps {... -1, 0, 1 ...} in ℤ to {... -1.0, 0.0, 1.0 ...} in ℝ (using the C conventions for ints and floats). Using + for addition over ℤ and +. for addition over ℝ, H also takes + to +. (that is, we need to say what the group operation is in each case; this is important when thinking about symmetry groups, for example). So it is true that for all i, j in ℤ, H(i + j) = H(i) +. H(j).
However, mapping from ℝ onto ℤ (quantization, Q) is a much trickier business. One obvious technique is to map each element of ℝ to the nearest integer (i.e. to round it off). But this is not a homomorphism, because there are cases where for some r, s in ℝ, Q(r +. s) ≠ Q(r) + Q(s); for example, Q(1.6 +. 1.6) = Q(3.2) = 3, but Q(1.6) + Q(1.6) = 2 + 2 = 4. So the preservation of addition from ℝ to ℤ is only partial.
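A minimal sketch of the arithmetic (the rounding convention here, ties rounded up, is my own choice for the illustration):

```python
import math

# Quantization Q (rounding to the nearest integer) fails to preserve addition,
# while the embedding of the integers into the reals preserves it.

def Q(x):
    """Round to the nearest integer, ties rounded up."""
    return math.floor(x + 0.5)

r = s = 1.6
print(Q(r + s), Q(r) + Q(s))   # 3 4 -- so Q(r + s) != Q(r) + Q(s)

# The embedding H: i -> float(i) does preserve addition (for machine-sized ints).
i, j = 2, 3
print(float(i + j) == float(i) + float(j))   # True
```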
References
Chabot A. 2019. What’s wrong with being a rhotic? Glossa, 4(1), 38. DOI: http://doi.org/10.5334/gjgl.618
"The notion that phonetic realizations of phonological objects function in an arbitrary fashion is counterintuitive at best, confounding at worst. However, order is restored to both phonology and phonetics if a modular theory of mind (Fodor 1983) is considered. In a modular framework, cognition is viewed as work carried out by a series of modules, each of which uses its own vocabulary and transmits inputs and outputs to other modules via interfaces known as transducers (Pylyshyn 1984; Reiss 2007), and the relationship between phonetics and phonology must be arbitrary. This formalizes the intuition that phonology deals in the discrete while phonetics deals in the continuous. A phonological object is an abstract cognitive unit composed of features or elements, with a phonetic realization that is a physical manifestation of that object located in time and space, which is composed of articulatory and perceptual cues." [italics in original, boldface added here]
The implication seems to be that any relation between discrete and continuous systems is "arbitrary". However, there are non-arbitrary mappings between discrete and continuous systems. The best known is almost certainly the relationship between the integers (ℤ) and the reals (ℝ). There is a homomorphism (and only one) from ℤ into ℝ, and it is the obvious one that preserves addition (and lots of other stuff). Call this H. That is, H maps {... -1, 0, 1 ...} in ℤ to {... -1.0, 0.0, 1.0 ...} in ℝ (using the C conventions for ints and floats). Using + for addition over ℤ and +. for addition over ℝ, then H also takes + to +. (that is, we need to say what the group operation in each case is, this is important when thinking about symmetry groups for example). So now it is true that for all i, j in ℤ H(i + j) = H(i) +. H(j).
However, mapping from ℝ onto ℤ (quantization, Q) is a much trickier business. One obvious technique is to map the elements of ℝ to the nearest integer (i.e. to round them off). But this is not a homomorphism because there are cases where for some r, s in ℝ, Q(r +. s) ≠ Q(r) + Q(s), for example Q(1.6 +. 1.6) = Q(3.2) = 3, but Q(1.6) + Q(1.6) = 2 + 2 = 4. So the preservation of addition from ℝ to ℤ is only partial.
References
Chabot A 2019. What’s wrong with being a rhotic?. Glossa, 4(1), 38. DOI: http://doi.org/10.5334/gjgl.618
Thursday, April 4, 2019
Felinology
(In memory of Felix d. 1976, Monty d. 1988, Jazz d. 1993)
In Nature today, some evidence that cats can distinguish their own names. The cats were tested in their homes using a habituation-dishabituation method. This is in contrast to dogs, who have been tested using retrieval tasks, because "the training of cats to perform on command would require a lot of effort and time." From a quick scan of the article, it isn't clear whether the foils for the names were minimal pairs, though.
Call for papers: Melodic primes in phonology
From Alex and Tobias:
Special issue of Canadian Journal of Linguistics/Revue canadienne de linguistique
Call for papers
We are calling for high-quality papers addressing the status of melodic primes in phonology, in particular in substance-free phonology frameworks. That is, do phonological primes bear phonetic information, and if so, how much and in which guise exactly? How are melodic primes turned into phonetic objects? In the work of Hale & Reiss, who coined the term substance-free phonology, it is only phonological computation that is unimpacted by phonetic substance; substance is nonetheless present in the phonology: melodic primes are still phonetic in nature, and their phonetic content determines how they will be realized as phonetic objects. We are interested in arguments for the presence of phonetic information in melodic primes, as well as for the alternative position which sees melodic primes as being entirely void of phonetic substance.
At the recent Phonological Theory Agora in Nice, there was some discussion regarding the implications a theory of substance-free melodic primes has for phonology; a variety of frameworks – including Optimality Theory, Government Phonology, and rule-based approaches – have all served as frameworks for theories which see melodic primes as entirely divorced from phonetic information. The special issue seeks to highlight some of those approaches, and is intended to spark discussion between advocates of the various positions and between practitioners of different frameworks.
We are especially interested in the implications a theory of substance-free primes has for research in a number of areas central to phonological theory, including: phonological representations, the acquisition of phonological categories, the form of phonological computation, the place of marginal phenomena such as “crazy rules” in phonology, the meaning of markedness, the phonology of signed languages, the nature of the phonetics/phonology interface, and more. Substance-free primes also raise big questions related to the question of emergence: are melodic primes innate or do they emerge through usage? How are phonological patterns acquired if primes are not innate?
As a first step, contributors are asked to submit a two-page abstract to the editor at alexander.chabot@univ-cotedazur.fr
Contributions will be evaluated based on relevance for the special issue topic, as well as the overall quality and contribution to the field. Contributors of accepted abstracts will be invited to submit a full paper, which will undergo the standard peer review process. Contributions that do not fulfill the criteria for this special issue can, naturally, still be submitted to the Canadian Journal of Linguistics/Revue canadienne de linguistique.
Timeline:
(a) June 1, 2019: deadline for abstracts, authors notified by July
(b) December 2019: deadline for first submission
(c) January 2020: sending out of manuscripts for review
(d) March 2020: completion of the first round of peer review
(e) June 2020: deadline for revised manuscripts
(f) August 2020: target date for final decision on revised manuscripts
(g) October 2020: target date for submission of copy-edited manuscripts
(h) CJL/RCL copy-editing of papers
(i) End of 2020: Submission of copy-edited papers to Cambridge University Press (4 months before publication date).
Wednesday, April 3, 2019
Dueling Fodor interpretations
Bill Idsardi
Alex and Tobias from their post:
"The ground rule of (Fodorian) modularity is domain specificity: computational systems can only parse and compute units that belong to a proprietary vocabulary that is specific to the system at hand."
and
"Hence Hale & Reiss' statement that nothing can be parsed by the cognitive system that wasn't present at birth (or that the cognitive system does not already know) appears to be just incorrect. Saying that unknown stimulus can lead to cognitive categories everywhere except in phonology seems a position that is hard to defend."
I think both parties here are invoking Fodor, but with different emphases. Alex and Tobias are cleaving reasonably close to Fodor 1983 while Charles and Mark are continuing some points from Fodor 1980, 1998.
But Fodor is a little more circumspect than Alex and Tobias about intermodular information transfer:
Fodor 1983:46f: "the input systems are modules ... I imagine that within (and, quite possibly, across)[fn13] the traditional modes, there are highly specialized computational mechanisms in the business of generating hypotheses about the distal sources of proximal stimulations. The specialization of these mechanisms consists in constraints either on the range of information they can access in the course of projecting such hypotheses, or in the range of distal properties they can project such hypotheses about, or, most usually, on both."
"[fn13] The "McGurk effect" provides fairly clear evidence for cross-modal linkages in at least one input system for the modularity of which there is independent evidence. McGurk has demonstrated that what are, to all intents and purposes, hallucinatory speech sounds can be induced when the subject is presented with a visual display of a speaker making vocal gestures appropriate to the production of those sounds. The suggestion is that (within, presumably, narrowly defined limits) mechanisms of phonetic analysis can be activated by -- and can apply to -- either acoustic or visual stimuli. It is of central importance to realize that the McGurk effect -- though cross-modal -- is itself domain specific -- viz., specific to language. A motion picture of a bouncing ball does not induce bump, bump, bump hallucinations. (I am indebted to Professor Alvin Liberman both for bringing McGurk's results to my attention and for his illuminating comments on their implications.)" [italics in original]
I think this quote deserves a slight qualification, as there is now quite a bit of evidence for multisensory integration in the superior temporal sulcus (e.g. Noesselt et al 2012). As for "bump, bump, bump", silent movies of people speaking don't induce McGurk effects either. The cross-modal effect is broader than Fodor thought too, as non-speech visual oscillations that occur in phase with auditory oscillations do enhance brain responses in auditory cortex (Jenkins et al 2011).
To restate my own view again, to the extent that the proximal is partially veridical with the distal, such computational mechanisms are substantive (both the elements and the relations between elements). The best versions of such computational mechanisms attempt to minimize both substance (the functions operate over a minimum number of variables about distal sources; they provide a compact encoding) and arbitrariness (the "dictionary" is as small as possible; it contains just the smallest fragments that can serve as a basis for the whole function; the encoding is compositional and minimizes discontinuities).
And here's Fodor on the impossibility of inventing concepts:
Fodor 1980:148: "Suppose we have a hypothetical organism for which, at the first stage, the form of logic instantiated is propositional logic. Suppose that at stage 2 the form of logic instantiated is first-order quantificational logic. ... Now we are going to try to get from stage 1 to stage 2 by a process of learning, that is, by a process of hypothesis formation and confirmation. Patently, it can't be done. Why? ... [Because] such a hypothesis can't be formulated with the conceptual apparatus available at stage 1; that is precisely the respect in which propositional logic is weaker than quantificational logic."
Fodor 1980:151: "... there is no such thing as a concept being invented ... It is not a theory of how you acquire concepts, but a theory of how the environment determines which parts of the conceptual mechanism in principle available to you are in fact exploited." [italics in original]
You can select or activate a latent ability on the basis of evidence and criteria (the first order analysis might be much more succinct than the propositional analysis) but you can't build first order logic solely out of the resources of propositional logic. You have to have first order logic already available to you in order for you to choose it.
References
Fodor JA 1980. On the impossibility of acquiring "more powerful" structures. Fixation of belief and concept acquisition. In M Piattelli-Palmarini (ed.) Language and Learning: The Debate between Jean Piaget and Noam Chomsky. Harvard University Press. 142-162.
Fodor JA 1983. Modularity of Mind. MIT Press.
Fodor JA 1998. Concepts: Where Cognitive Science went Wrong. Oxford University Press.
Jenkins J, Rhone AE, Idsardi WJ, Simon JZ, Poeppel D 2011. The Elicitation of Audiovisual Steady-State Responses: Multi-Sensory Signal Congruity and Phase Effects. Brain Topography, 24(2), 134–148.
Noesselt T, Bergmann D, Heinze H-J, Münte T, Spence C 2012. Coding of multisensory temporal patterns in human superior temporal sulcus. Frontiers in Integrative Neuroscience, 6, 64.
Sunday, March 31, 2019
Seeing with your tongue
EM: You refuse to look through my telescope.
TK (gravely): It's not a telescope, Errol. It's a kaleidoscope.
Exchange recounted in The Ashtray by Errol Morris (p12f)
Alex and Tobias in their post:
"At a more general cognitive level, we know positively that the human brain/mind is perfectly able to make sense of sensory input that was never encountered and for sure is not innate. Making sense here means "transform a sensory input into cognitive categories". There are multiple examples of how electric impulses have been learned to be interpreted as either auditive or visual perception: cochlear implants on the one hand, so-called artificial vision, or bionic eye on the other hand. The same goes for production: mind-controlled prostheses are real."
The nervous system can certainly "transform a sensory input into cognitive categories"; the question is how wild these transformations (transductions, interfaces) can be. No surprise, I'm going to say that they are highly constrained and therefore not fully arbitrary, basically limited to quasimorphisms. In the case of (visual) geometry, I think that we can go further and say that they are constrained to affine transformations and radial basis functions.
One of the better known examples of a neuromorphic sensory prosthetic device is the Brainport, an electrode array which sits on the surface of the tongue ( https://www.youtube.com/watch?v=48evjcN73rw ). The Brainport is a 2D sensor array, and so there is a highly constrained geometric relationship between the tongue-otopic coordinate system of the Brainport and the retinotopic one. As noted in the Wikipedia article, "This and all types of sensory substitution are only possible due to neuroplasticity." But neuroplasticity is not total, as shown by the homotopically limited range of language reorganization (Tivarus et al 2012).
So the thought experiment here consists of thinking about stranger tongue-otopic arrangements and whether they would work in a future Brainport V200 device.
1. Make a "funhouse" version of the Brainport. Flip the vertical and/or horizontal dimensions. This would be like wearing prismatic glasses. Reflections are affine transformations. This will work.
2. Make a color version of the Brainport. Provide three separate sensor arrays, one for each of the red, green and blue wavelengths. In the retina the different cone types for each "pixel" are intermixed (spatially proximate), in the color Brainport they wouldn't be. We would be effectively trying to use an analog of stereo vision computation (but with 3 "eyes") to do color registration and combination. It's not clear whether this would work.
3. Make a "kaleidoscope" version of the Brainport. Randomly connect the camera pixel array with the sensor array, such that adjacent pixels are no longer guaranteed to be adjacent on the sensor array. The only way to recover the adjacency information is via a (learned) lookup table. This is beyond the scope of affine transformations and radial basis functions. This will not work.
References
Liu S-C, Delbruck T 2010. Neuromorphic sensory systems. Current Opinion in Neurobiology, 20(3), 288–295.
Tivarus ME, Starling SJ, Newport EL, Langfitt JT 2012. Homotopic language reorganization in the right hemisphere after early left hemisphere injury. Brain and Language, 123(1), 1–10.
Friday, March 29, 2019
More on "arbitrary"
Bill Idsardi
Alex and Tobias have upped the ante, raised the stakes and doubled down in the substance debate, advocating a "radical substance-free" position in their post.
I had been pondering another post on this topic myself since reading Omer's comment on his blog "a parallel, bi-directional architecture is literally the weakest possible architectural assumption". So I guess Alex and Tobias are calling my bluff, and I need to show my cards (again).
So I agree that "substance abuse" is bad, and I agree that minimization of substantive relationships is a good research tactic, but "substance-free" is at best a misnomer, like this "100% chemical free hair dye" which shoppers assume isn't just an empty box. A theory lacking any substantive connection with the outside world would be a theory about nothing.
And there's more to the question of "substance" than just entities, there are also predicates and relations over those entities. If phonology is a mental model for speech then it must have a structure and an interpretation, and the degree of veridicality in the interpretation of the entities, predicates and relations is the degree to which the model is substantive. Some truths about the entities, predicates and relations in the outside world will be reflected in the model, that's its substance. The computation inside the model may be encapsulated, disconnected from events in the world, without an interesting feedback loop (allowing, say, for simulations and predictions about the world) but that's a separate concept.
As in the case discussed by Omer, a lot of the debate about "substance" seems to rest on architectural and interface assumptions (with the phonology-phonetics-motor-sensory interfaces often termed "transducers" with nods to sensory transducers, see Fain 2003 for an introduction). The position taken by substance-free advocates is that the mappings achieved by these interfaces/transducers (even stronger, all interfaces) are arbitrary, with the canonical example being a look-up table, as exhibited by the lexicon. For example, from Scheer 2018:
“Since lexical properties by definition do not follow from anything (at least synchronically speaking), the relationship between the input and the output of this spell-out is arbitrary: there is no reason why, say, -ed, rather than -s, -et or -a realizes past tense in English.
The arbitrariness of the categories that are related by the translational process is thus a necessary property of this process: it follows from the fact that vocabulary items on either side cannot be parsed or understood on the other side. By definition, the natural locus of arbitrariness is the lexicon: therefore spell-out goes through a lexical access.
If grammar is modular in kind then all intermodular relationships must instantiate the same architectural properties. That is, what is true and undisputed for the upper interface of phonology (with morpho-syntax) must also characterize its lower interface (with phonetics): there must be a spell-out operation whose input (phonological categories) entertain an arbitrary relationship with its output (phonetic categories).” [italics in original, boldface added here]
Channeling Omer then, spell-out via lookup table is literally the weakest possible architectural assumption about transduction. A lookup table is the position of last resort, not the canonical example. Here's Gallistel and King (2009: xi) on this point:
“By contrast, a compact procedure is a composition of functions that is guaranteed to generate (rather than retrieve, as in table look-up) the symbol for the value of an n-argument function, for any arguments in the domain of the function. The distinction between a look-up table and a compact generative procedure is critical for students of the functional architecture of the brain.”
I think it may confuse some readers that Gallistel and King talk quite a bit about lookup tables, but they do say "many functions can be implemented with simple machines that are incomparably more efficient than machines with the architecture of a lookup table" (p. 53).
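To make the contrast concrete, here is a minimal sketch (mine, not Gallistel and King's, and not a serious analysis of English): the regular past tense realized once as a pure lookup table and once as a compact procedure that computes the allomorph from the last segment of the stem. The three-way [t]/[d]/[ɪd] split is the textbook generalization; the orthographic stand-ins for segments and pronunciations below are purely illustrative.

# A lookup table stores one entry per item; it says nothing about novel stems.
past_lookup = {
    "walk": "walkt", "kiss": "kisst",    # voiceless-final stems
    "hug": "hugd", "play": "playd",      # voiced-final stems
    "want": "wantid", "need": "needid",  # t/d-final stems
}

# A compact procedure computes the same mapping from a property of the stem,
# covering unseen items with a handful of conditions over a natural class.
VOICELESS = set("ptkfs")  # crude orthographic proxy for the voiceless class

def past_compact(stem: str) -> str:
    last = stem[-1]
    if last in "td":
        return stem + "id"   # wanted, needed
    if last in VOICELESS:
        return stem + "t"    # walked, kissed
    return stem + "d"        # hugged, played

print(past_compact("blick"))  # 'blickt' -- generalizes; the table cannot

The table grows linearly with the vocabulary; the procedure is a short composition of conditions over a natural class, which is the sense in which it is compact.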
Jackendoff 1997:107f (who advocates a parallel, bi-directional architecture of the language faculty by the way) struggles to find analogs to the lexicon:
"One of the hallmarks of language, of course, is the celebrated "arbitrariness of the sign," the fact that a random sequence of phonemes can refer to almost anything. This implies, of course, that there could not be language without a lexicon, a list of the arbitrary matches between sound and meaning (with syntactic properties thrown in for good measure).
If we look at the rest of the brain, we do not immediately find anything with these same general properties. Thus the lexicon seems like a major evolutionary innovation, coming as if out of nowhere."
Jackendoff then goes on to offer some possible examples of lexicon-like associations: vision with taste ("mashed potatoes and French vanilla ice cream don't look that different") and skilled motor movements like playing a violin or speaking ("again it's not arbitrary, but processing is speeded up by having preassembled units as shortcuts.") But his conclusion ("a collection of stored associations among fragments of disparate representations") is that overall "it is not an arbitrary mapping".
As I have said before, in my opinion a mapping has substance to the extent that it has partial veridicality. (Max in the comments to the original post prefers "motivated" to what I called "non-arbitrary" but see Burling who draws a direct opposition between "motivated" and "arbitrary".)
So I have two points to re-emphasize about partial veridicality: it's partial and it displays some veridicality.
Partially, not completely, veridical
This is the easy part, and the linguists all get this one. (But it was a continuing source of difficulty for some neuro people in my grad cogsci course over the years.) The sensory systems of animals are limited in dynamic range and in many other ways. The whole concept of a “just noticeable difference” means that there are physical differences that are below the threshold of sensory detection. The fact that red is next to violet on the color wheel is also an example of a non-veridical aspect of color perception.
These are relatively easy because they are a bit like existence proofs. We just need to find some aspect of the system that breaks a relationship at a single point across the interface. Using T to represent transduction, we need to find a relation R such that R(x,y) holds but TR(Tx,Ty) does not hold everywhere or vice versa. In the color wheel example the "external" relation is wavelength distance, and the "internal" relation is perceptual hue similarity; violet is perceptually similar to red even though the wavelength of violet is maximally distant from red in the visible spectrum. (But otherwise wavelength distance is a good predictor of perceptual similarity.) And this same argument extends to intermodular relationships within the visual system, as in the mapping between the RGB hue representation in the retina and the R/G-Y/B opponent process representation in the lateral geniculate nucleus.
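To put rough numbers on the schema (ballpark values for illustration only, not measurements): let R be "closer in wavelength to red" and let T map a color to an approximate angle on the perceptual hue circle. Green beats violet on R, but after transduction the relation flips, because the hue circle bends the wavelength line back on itself at the red/violet seam.

# Illustrative values only: peak wavelengths (nm) and hue-circle angles (degrees).
wavelength = {"red": 700, "green": 530, "violet": 410}
hue_angle = {"red": 0, "green": 120, "violet": 280}  # T: color -> hue circle

def wl_dist(a, b):
    return abs(wavelength[a] - wavelength[b])

def hue_dist(a, b):
    d = abs(hue_angle[a] - hue_angle[b]) % 360
    return min(d, 360 - d)  # distance on a circle, not a line

# External relation: green is closer to red than violet is (in wavelength).
print(wl_dist("green", "red") < wl_dist("violet", "red"))    # True
# Transduced relation: on the hue circle, violet is closer to red than green.
print(hue_dist("green", "red") < hue_dist("violet", "red"))  # False

Most of the wavelength ordering survives the mapping (that is the partial veridicality); it breaks only where the line has been bent into a circle.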
Partially, not completely, arbitrary
I am never forget the day
I am given first original paper to write
It was on analytical algebraic topology
Of locally Euclidean metrization
Of infinitely differentiable Riemannian manifold
Боже мой!
This I know from nothing
Tom Lehrer, "Lobachevsky"
This is somewhat harder to think about because one has to imagine really crazy functions (i.e. arbitrary functions in the mathematical sense, full lookup table functions). To put my cards on the table, I don't believe sensory transducers are capable of computing arbitrary functions (the place to look for this would be the olfactory system). I think they are limited to quasimorphisms, capable of making some changes in topology (e.g. line to circle in color vision) but the functions are almost everywhere differentiable, offering a connection with manifold learning (Jansen & Niyogi 2006, 2013). I think Gallistel and King (2009: x) have pretty much the same view (though I think "homomorphism" is slightly too strong):
“Representations are functioning homomorphisms. They require structure-preserving mappings (homomorphisms) from states of the world (the represented system) to symbols in the brain (the representing system). These mappings preserve aspects of the formal structure of the world.”
So here's another bumper sticker slogan: preserved structure is substance.
It's homomorphic not isomorphic so the structure is not completely preserved (it's only partially veridical). But it doesn't throw out all the structure, which includes not just entities but also relationships among entities.
A small example of this sort can be found in Heffner et al 2019. Participants were asked to learn new categories, mappings between sounds and colors, with the sounds drawn from a fricative continuum between [x] and [ç] (1-10), and the associated colors drawn from the various conditions shown in the figure.
I don't think it should come as much of a surprise that "picket fence" and "odd one out" are pretty hard for people to learn. So the point here is that there is structure in the learning mechanism; mappings with fewer discontinuities are preferred.
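As a rough sketch of the idea (the condition layouts below are my illustrative stand-ins, not the actual Heffner et al. stimuli): treat each condition as a mapping from the ten continuum steps to two colors and count how often the category changes between neighboring steps. The easy conditions have a single boundary; "odd one out" carves one step out of a category; "picket fence" alternates on every step.

def discontinuities(mapping):
    """Number of category changes between adjacent steps of the continuum."""
    return sum(a != b for a, b in zip(mapping, mapping[1:]))

# Illustrative layouts over a 10-step [x]-[ç] continuum (B = blue, O = orange).
conditions = {
    "two-category": "BBBBBOOOOO",  # one boundary
    "odd one out":  "BBBBOBBBBB",  # a single step carved out of one category
    "picket fence": "BOBOBOBOBO",  # alternating categories
}

for name, layout in conditions.items():
    print(name, layout, "discontinuities =", discontinuities(layout))

Counting boundaries this way, the hard conditions are exactly the ones with many discontinuities.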
Here's a similar finding from gerbils (Ohl 2001, 2009):
Ohl et al 2009: "Animals trained on one or more training blocks never generalized to pure tones of any frequency (e.g. start or stop frequencies of the modulated tone, or frequencies traversed by the modulation or extrapolated from the modulation). This could be demonstrated by direct transfer experiments (Ohl et al 2001, supplementary material) or by measuring generalization gradients for modulation rate which never encompassed zero modulation rates (Ohl et al 2001)." [pure tones have a zero modulation rate -- WJI]
That is, the gerbils don't choose a picket fence interpretation either, although that would work here, based on the starting frequency of the tone. Instead, they find the function with the fewest discontinuities that characterizes the data, based on their genetic endowment of spectro-temporal receptive fields (STRFs) in their primary auditory cortex. They don't get to invent new STRFs, let alone create arbitrary ones. The genetic endowment provides the structure for the sensory transductions, and thus some functions are learnable while many are not. So the resulting functions are partially, but not completely arbitrary. And they have a limited number of discontinuities.
By the way, exemplar (instance-based) learning models have no trouble with picket fence arrangements, learning them as quickly as they learn the other types.
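A minimal illustration of why (a sketch, not the actual models from that literature): a one-nearest-neighbor exemplar learner just stores labeled instances and classifies by retrieving the closest stored one, so an alternating picket-fence mapping costs it nothing extra, precisely because retrieval is a lookup and imposes no continuity on the category function.

# A bare-bones exemplar (instance-based) learner: store everything, retrieve the nearest.
class ExemplarLearner:
    def __init__(self):
        self.memory = []  # list of (continuum_step, label) pairs

    def learn(self, step, label):
        self.memory.append((step, label))

    def classify(self, step):
        _, label = min(self.memory, key=lambda ex: abs(ex[0] - step))
        return label

picket_fence = {s: "blue" if s % 2 else "orange" for s in range(1, 11)}

learner = ExemplarLearner()
for step, label in picket_fence.items():
    learner.learn(step, label)

# Perfect recall of the alternating mapping -- its discontinuities cost nothing.
print(all(learner.classify(s) == picket_fence[s] for s in range(1, 11)))  # True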
OK, I think that's enough for now. I'll address my take on the relative priority of features and segments in another post.
References
Fain GL 2003. Sensory Transduction. Sinauer.
Gallistel CR, King AP 2009. Memory and the Computational Brain. Wiley-Blackwell.
Heffner CC, Idsardi WJ, Newman RS 2019. Constraints on learning disjunctive, unidimensional auditory and phonetic categories. Attention, Perception & Psychophysics. https://doi.org/10.3758/s13414-019-01683-x
Jackendoff R 1997. The Architecture of the Language Faculty. MIT Press.
Jansen A, Niyogi P 2006. Intrinsic Fourier analysis on the manifold of speech sounds. IEEE ICASSP. Retrieved from https://ieeexplore.ieee.org/abstract/document/1660002/
Jansen A, Niyogi P 2013. Intrinsic Spectral Analysis. IEEE Transactions on Signal Processing, 61(7), 1698–1710.
Ohl FW, Scheich H, Freeman WJ 2001. Change in pattern of ongoing cortical activity with auditory category learning. Nature, 412(6848), 733–736.
Ohl FW, Scheich H 2009. The role of neuronal populations in auditory cortex for category learning. In Holscher C, Munk M (Eds.) Information Processing by Neuronal Populations. Cambridge University Press. 224-246.
Scheer T 2018. The workings of phonology and its interfaces in a modular perspective. In Annual conference of the Phonological Society of Japan. phsj.jp. Retrieved from http://phsj.jp/PDF/abstract_Scheer_forum2018.pdf
Wednesday, March 27, 2019
That 32GB flashdrive? It's 20,000 people worth of language acquisition information
Today in the Royal Society Open Science:
Mollica F, Piantadosi ST. 2019. Humans store about 1.5 megabytes of information during language acquisition. R. Soc. open sci. 6: 181393. http://dx.doi.org/10.1098/rsos.181393
Tuesday, March 26, 2019
A new player has entered the game
Here is a guest post from Alex Chabot and Tobias Scheer picking up a thread from about a year ago now. Bill
Alex Chabot & Tobias Scheer
What it is that is substance-free: computation and/or melodic primes
A late contribution to the debate...
In his post from April 12th, 2018, Veno clarified his take on the status of melodic primes (features) in phonology (which is identical to the one presented in the work of Hale & Reiss since 2000). The issue that gave rise to some misunderstanding, and probably some misconception, about the kind of primes that Hale-Reiss-Volenec propose concerns their substance-free status: which aspect of them is actually substance-free and which is not? This is relevant because the entire approach initiated by Hale & Reiss' 2000 LI paper has come to be known as substance-free.
Veno has thus made explicit that phonological features in his view are substance-laden, but that this substance does not bear on phonological computation. That is, phonological features bear phonetic labels ([labial], [coronal] etc.) in the phonology, but phonological computation ignores them and is able to turn any feature set into any other feature set in any context and its reverse. This is what may be called substance-free computation (i.e. computation that does not care for phonetics). At the same time, Veno explains, the phonetic information carried by the features in the phonology is used upon externalization (if we may borrow this word for phonological objects): it defines how features are pronounced (something called transduction by Hale-Reiss-Volenec, or phonetic implementation system PIS in Veno's post). That is, phonological [labial] makes sure that it comes out as something phonetically labial (rather than, say, dorsal). The correspondence between the phonological object and its phonetic exponent is thus firmly defined in the phonology - not by the PIS device.
The reason why Hale & Reiss (2003, 2008: 28ff) have always held that phonological features are substance-laden is learnability: they contend that cognitive categories cannot be established if the cognitive system does not know beforehand what kind of sensory input will come its way and relates to the particular category ("let's play cards"). Hence labiality, coronality, etc. would be unparsable noise for the L1 learner if they did not know at birth what labiality, coronality, etc. are. Therefore, Hale-Reiss-Volenec conclude, substance-laden phonological features are universal and innate.
We believe that this take on melodic primes is misguided (we talk about melodic primes since features are the regular currency, but there are also approaches that entertain bigger, holistic primes, called Elements; everything said about features also applies to Elements). The alternative to which we subscribe is called radical substance-free phonology, where "radical" marks the difference from Hale-Reiss-Volenec: in this view both phonological computation and phonological primes are substance-free. That is, phonology is really self-contained in the Saussurean sense: no phonetic information is present (as opposed to: present but ignored). Melodic primes are thus alphas, betas and gammas: they ensure contrast and the infra-segmental decomposition that is independently necessary. They are related to phonetic values by the exact same spell-out procedure that is known from the syntax-phonology interface: vocabulary X is translated into vocabulary Y through a lexical access (Scheer 2014). Hence α ↔ labiality (instead of [labial] ↔ labiality).
1. Formulations
To start, the misunderstanding that Veno had the good idea to clarify was fostered by formulations like:
"[w]e understand distinctive features here as a particular kind of substance-free units of mental representation, neither articulatory nor acoustic in themselves, but rather having articulatory and acoustic correlates." Reiss & Volenec (2018: 253, emphasis in original)
Calling features substance-free when they are actually substance-laden is probably not a good idea. What is meant is that phonological computation is substance-free. But the quote talks about phonological units, not computation.
2. Incompatible with modularity
The ground rule of (Fodorian) modularity is domain specificity: computational systems can only parse and compute units that belong to a proprietary vocabulary that is specific to the system at hand. In Hale-Reiss-Volenec's view, though, phonological units are defined by extra-phonological (phonetic) properties. Hence, given domain specificity, phonology is unable to parse phonetically defined units such as [labial], [coronal], etc. Or else, if "labial", "coronal", etc. are vocabulary items of the proprietary vocabulary used in phonological computation, this computation comprises both phonology and phonetics. Aside from the fact that there has been enough blurring of these boundaries in the past two decades or so and that Hale-Reiss-Volenec have expressed themselves repeatedly in favour of a clear modular cut between phonetics and phonology, the architecture of their system defines phonology and phonetics as two separate systems, since it has a translational device (transduction, PIS) between them.
One concludes that phonological primes that are computed by phonological computation, but which bear phonetic labels (and in fact are not defined or differentiated by any other property), are a (modular) contradiction in terms.
To illustrate that, see what the equivalent would be in another linguistic module, (morpho‑)syntax: what would you say about syntactic primes such as number, animacy, person etc. which come along as "coronal", "labial" etc. without making any reference to number, animacy, person? That is, syntactic primes that are not defined by syntactic but by extra-syntactic (phonological) vocabulary? In this approach it would then be said that even though primes are defined by non-syntactic properties, they are syntactic in kind and undergo syntactic computation, which however ignores their non-syntactic properties.
This is but another way to state the common-sense question prompted by a system where the only properties that phonological primes have are phonetic, but which are then ignored by phonological computation: what are the phonetic labels good for? They do not do any labour in the phonology, and they need to be actively ignored. Hale-Reiss-Volenec's answer was mentioned above: they exist because of learnability. This is what we address in the following point.
3. Learnability
Learnability concerns of substance-free melodic primes are addressed by Samuels (2012), Dresher (2018) and a number of contributions in Clements & Ridouane (2011). They are the focus of a recent ms by Odden (2019).
At a more general cognitive level, we know positively that the human brain/mind is perfectly able to make sense of sensory input that was never encountered and for sure is not innate. Making sense here means "transform a sensory input into cognitive categories". There are multiple examples of how electric impulses have been learned to be interpreted as either auditive or visual perception: cochlear implants on the one hand, so-called artificial vision, or bionic eye on the other hand. The same goes for production: mind-controlled prostheses are real. Hence Hale & Reiss' statement that nothing can be parsed by the cognitive system that wasn't present at birth (or that the cognitive system does not already know) appears to be just incorrect. Saying that unknown stimulus can lead to cognitive categories everywhere except in phonology seems a position that is hard to defend.
References
Clements, George N. & Rachid Ridouane (eds.) 2011. Where do Phonological Features come from? Cognitive, physical and developmental bases of distinctive speech categories. Amsterdam: Benjamins.
Dresher, Elan 2018. Contrastive Hierarchy Theory and the Nature of Features. Proceedings of the 35th West Coast Conference on Formal Linguistics 35: 18-29.
Hale, Mark & Charles Reiss 2000. Substance Abuse and Dysfunctionalism: Current Trends in Phonology. Linguistic Inquiry 31: 157-169.
Hale, Mark & Charles Reiss 2003. The Subset Principle in Phonology: Why the tabula can't be rasa. Journal of Linguistics 39: 219-244.
Hale, Mark & Charles Reiss 2008. The Phonological Enterprise. Oxford: OUP.
Odden, David 2019. Radical Substance Free Phonology and Feature Learning. Ms.
Reiss, Charles & Veno Volenec 2018. Cognitive Phonetics: The Transduction of Distinctive Features at the Phonology–Phonetics Interface. Biolinguistics 11: 251-294.
Samuels, Bridget 2012. The emergence of phonological forms. Towards a biolinguistic understanding of grammar: Essays on interfaces, edited by Anna Maria Di Sciullo, 193-213. Amsterdam: Benjamins.
Wednesday, March 13, 2019
Alec Marantz on the goals and methods of Generative Grammar
I always like reading papers aimed at non-specialists by leading lights of a specialty. This includes areas that I have some competence in. I find that I learn a tremendous amount from such non-technical papers for they self consciously aim to identify the big ideas that make an inquiry worth pursuing in the first place and the general methods that allow it to advance. This is why I always counsel students to not skip Chomsky's "popular" books (e.g. Language and Mind, Reflections on Language, Knowledge of Language, etc.).
Another nice (short) addition to this very useful literature is a paper by Alec Marantz (here): What do linguists do? Aside from giving a nice overview of how linguists work, it also includes a quick and memorable comment on Everett's (mis)understanding of his critique of GG. What Alec observes is that even if one takes Everett's claims entirely at face value empirically (which, one really shouldn't), his conclusion that Piraha is different in kind from a language like English wrt the generative procedures it deploys does not follow. Here is Alec:
His [Everett's, NH] analysis of Pirahã actually involves claiming Pirahã is just like every other language, except that it has a version of a mechanism that other languages use that, in Pirahã, limits the level of embedding of words within phrases.
I will let Alec explain the details, but the important point is that Everett confuses two very different issues that must be kept apart: what are the generative procedures that a given G deploys, and what are the products of those procedures. Generative grammarians of the Chomsky stripe care a lot about the first question (what are the rule types that Gs can have). What Alec observes (and what Everett actually concedes in his specific proposal) is that languages that use the very same generative mechanisms can end up with very different products. Who would have thunk it!
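To make the procedure-versus-product point concrete, here is a toy sketch of my own (it is nobody's actual analysis of Pirahã, and the rule and vocabulary are invented): one and the same recursive rule, run with and without a tight cap on embedding depth, yields very different sets of products.

```python
# Toy illustration: one and the same recursive rule, NP -> NP 's N | N,
# run with different caps on embedding depth. The *procedure* is identical;
# only the *products* differ.

def generate_nps(nouns, max_depth):
    """Return all NPs produced by the recursive possessive rule up to max_depth."""
    current = list(nouns)            # depth 0: bare nouns
    all_nps = list(current)
    for _ in range(max_depth):       # each pass adds one level of embedding
        current = [f"{np}'s {n}" for np in current for n in nouns]
        all_nps.extend(current)
    return all_nps

nouns = ["John", "brother", "canoe"]

print(generate_nps(nouns, max_depth=1))       # tightly capped: a small, finite product set
print(len(generate_nps(nouns, max_depth=3)))  # same rule, higher cap: many more products
```

The only thing the cap changes is the output set; the generative mechanism itself is untouched, which is exactly the distinction Everett's argument runs together.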
At any rate, take a look at Alec's excellent short piece. And while you are at it, you might want to read a short paper by another Syntax Master, Richie Kayne (here). He addresses a terrific question beloved by both neophytes and professionals: how many languages are there? I am pretty sure that his reply will both delight and provoke you. Enjoy.
Tuesday, March 12, 2019
Dan Milway discusses Katz's semantic theory
Dan Milway has an interesting project: reading Jerrold Katz's semantic investigations and discussing them for/with/near us. Here are two urls that discuss the preface and chapter 1 of Katz's 1972 Semantic Theory. Other posts are promised. I like these archeological digs into earlier thoughts on still murky matters. I suspect you will too.
Omer on the autonomy of syntax; though you will be surprised what the autonomy is from!
Here is a post from Omer that bears on the autonomy issue. There are various conceptions of autonomy. The weakest is simply the claim that syntactic relations cannot be reduced to any others; the standard targets of such attempted reductions are semantic generalizations and probabilistic generalizations over strings (hence the utility of 'Colorless green ideas sleep furiously'). There are, however, stronger versions that relate to how different kinds of information intersect in derivations. And this is what Omer discusses: do the facts dictate that we allow phonological/semantic information to intersperse with syntactic information to get the empirical trains to run on time? Omer takes on a recent suggestion that this is required and, imo, shreds the conclusion. At any rate, enjoy!
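As an illustration of the string-statistics half of that point, here is a minimal sketch (the lexicon, corpus and category pattern are all invented for the purpose): a toy grammar judges Chomsky's famous sentence well-formed even though none of its adjacent word pairs is attested in the toy corpus.

```python
import re

# Invented toy lexicon mapping words to categories (A = adjective, N, V, Adv).
LEXICON = {"colorless": "A", "green": "A", "ideas": "N", "sleep": "V",
           "furiously": "Adv", "big": "A", "dogs": "N", "bark": "V", "loudly": "Adv"}

def grammatical(sentence):
    """Toy grammaticality check: the category string must match (A)* N V (Adv)?."""
    cats = "".join(LEXICON[w] for w in sentence.split())
    return re.fullmatch(r"A*NV(Adv)?", cats) is not None

def all_bigrams_attested(sentence, corpus):
    """True only if every adjacent word pair in the sentence occurs in the corpus."""
    seen = {(a, b) for line in corpus
            for a, b in zip(line.split(), line.split()[1:])}
    words = sentence.split()
    return all(pair in seen for pair in zip(words, words[1:]))

corpus = ["big dogs bark loudly", "dogs sleep"]
s = "colorless green ideas sleep furiously"

print(grammatical(s))                   # True: licensed by the toy grammar
print(all_bigrams_attested(s, corpus))  # False: every bigram has zero corpus frequency
```

Nothing hangs on the particular toy grammar; the point is only that well-formedness and string frequency are computed over different things.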
Wednesday, March 6, 2019
More on non-academic jobs
Last week Norbert linked to a Nature article on non-academic careers. This week, Nature has another piece which offers very simple advice: talk to the people at the career center at your university. I did exactly this when I was finishing my PhD at MIT, and ended up interviewing for several non-academic research and development positions in industry.
I should also say that my advisor, Morris Halle, told me that I should try being a professor first because in his opinion it was easier to go from an academic career to a non-academic one. I'm not sure that's really true, but I took his advice, and so far I'm still working as a professor.
Saturday, March 2, 2019
Two articles in Inference this week
Wednesday, February 27, 2019
When academic jobs are hard to get
When I first graduated with a PhD, an academic job was not assured. Indeed, at the time (the mid 1970s into the mid 1980s) MIT was sending out acceptance letters warning that academic jobs were not thick on the ground: though the department could promise four wonderful years of intellectual ferment and excitement, whether these would be rewarded with an academic job at the end was quite unclear. This was their version of buyer beware.
If anything, my impression is that things have gotten worse. Even those that land decent jobs often do so after several years as Post Docs (not a bad gig actually, I had several), and even then people that have all the qualifications for academic appointment (i.e. had better CVs than me and my friends had when we entered the job market and landed positions) may not find anything at all. This is often when freshly minted PhDs start looking for non-academic jobs in, e.g., industry.
Departments do not prepare students for this option. Truth be told, it is not clear that we are qualified to prepare students for this. Well, let me back up: some are. People doing work on computational linguistics often have industry connections, and occasionally some people in the more expansive parts of the language sciences have connections to the helping professions in HESP. Students from UMD have gone on to get non-academic jobs in both these areas, sometimes requiring further education. However, these routes exist. That said, they are not common, and faculty are generally not that well placed to advise on how to navigate this terrain.
What, then, to do to widen your options? Here is a paper from Nature that addresses the issues. Most of the advice is common sense: network, get things done, develop the soft skills that working on a PhD allows you to refine, get some tech savvy. All this makes sense. The one piece I would like to emphasize is: learn to explain what you are doing in a simple, unencumbered way to others. This is a remarkable skill, and good even if you stay in academia. And in the outside world, being able to explain complex things simply is a highly prized virtue.
At any rate, take a look. The world would be a better place if all graduates got the jobs they wanted. Sadly this is not that world. Here are some tips from someone who navigated the rough terrain.
Thursday, February 21, 2019
Omer on phases and minimality
I am not on Facebook. This means that I often miss some fun stuff, like Omer's posts on topics syntactic. Happily, he understands my problem and sends me links to his cogitations. For others who may suffer from a similar Facebook phobia, I link to his post here.
The topic is one that I have found intriguing for quite a while: do we really need two locality conditions? Two? Yes, Phases (aka Bounding domains) and Minimality. Now, on their face these look quite different. The former places an absolute bound on computations; the latter bounds the reach of one expression when in the presence of another identical one. These two kinds of domain restrictions thus seem very different. However, looks can be deceiving. Not all phases count to delimit domains, at least if one buys into the strong vs weak distinction. If one does buy this, then, since strong v phases are transitive vs and transitive vs implicate at least two nominals, it looks like phases and minimality will both apply, redundantly, in these cases. Similarly, it is possible to evade minimality and phase impenetrability using similar "tricks" (e.g. becoming specifiers of the same projection). At any rate, once one presses, it appears that the two systems generate significant redundancy, which suggests that one of them might be dispensable. This is where Omer's post comes in. He shows that Minimality can apply in some cases where there is no obvious tenable phase-based account (viz. phase internally). Well, whether this is right or not, the topic is a nice juicy one and well worth thinking about. Omer's post is a great place to begin.
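For readers who like things spelled out, here is a deliberately crude sketch of my own (a gross simplification, not Omer's proposal or anyone's actual theory) of how the two conditions differ in kind: one is an absolute bound on how much structure a dependency may cross, the other is relativized to interveners of the same type.

```python
# Crude sketch: two different kinds of locality check over a movement path,
# represented as the list of nodes between a mover and its landing site.

def violates_phase_impenetrability(path, strong_phases=("vP", "CP")):
    """Absolute bound: blocked if more than one strong phase node is crossed
    (ignoring, in this toy model, escape through phase edges)."""
    crossed = sum(1 for node in path[1:-1] if node in strong_phases)
    return crossed > 1

def violates_minimality(path, mover_type, types):
    """Relativized bound: blocked if an element of the mover's own type intervenes."""
    return any(types.get(node) == mover_type for node in path[1:-1])

# A wh-phrase trying to cross an embedded CP, a vP, and another wh-element:
path = ["wh1", "CP", "wh2", "vP", "landing_site"]
types = {"wh1": "wh", "wh2": "wh"}

print(violates_phase_impenetrability(path))    # True: two strong phase nodes crossed
print(violates_minimality(path, "wh", types))  # True: wh2 is a same-type intervener
```

In configurations like this toy one, both checks fire, which is the kind of redundancy that makes one wonder whether both conditions are needed.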
Another logical problem of language acquisition: Part 1
Some of you may recall that I invited FoLers to submit stuff for blog fodder on the site. I have received a few takers, but none as enthusiastic as Callum Hackett. Below is the first part of an extended discussion based on his thesis topic. I like the idea of being in on the ground floor wrt this kind of stuff: new thoughts by young colleagues that lead me by the hand in new directions. I hope that Callum is the first of many who decide to educate the rest of us. Thx.
*****
Another logical problem of language acquisition: Part 1
Following on from various interesting discussions here on FoL, I’ve been meaning to elaborate on some of the comments I’ve made in places about how we might want to reconsider our grammatical architecture if we want generative theory to be a truly successful contributor to cognitive and evolutionary science. To help frame the major issues, in this first post I’m going to examine the basic logic of the competence/performance distinction, as well as some of its complications. Then, in a second, I’ll consider in more detail the actual relationship between competence and performance, and the implications this has for what our theory of competence ought to look like, if indeed having a theory of competence is what should qualify generative grammar as a science of the mind.
To advertise—these posts being a truncation of some of my doctoral work—so long as what we mean by ‘competence’ is the system of linguistic knowledge represented in the mind of a speaker, independent of any use that’s made of it, then my conclusion will be that any model of competence we can identify as a T-model (so, basically everything since Aspects) logically cannot work because the T-model misunderstands the relationship between knowledge and use.
Having said this, I also believe that we can devise a different architecture that preserves the best of the rest of generative theory, that gives us better stories to tell about cognition and evolution (and so better chances of being funded), and—my personal favourite—that allows us to make some strategic concessions to behaviourists that in the end seal the deal for nativism.
To make a start on all this, first we should recognize that a theory of competence is wholly inaccessible to us unless we have some idea of how competence relates to performance, simply because all of our analytical methods deal exclusively with performance data: the mental representations that are constitutive of competence are by definition inscrutable.
Of course, this quite trivial observation doesn't mean that linguists need to know any of the details about how competence gets performed; it just means that, because all the data available to us always is performed, we need at least an abstract conception of what performance amounts to, purely so we can factor it out.
So far, so uncontroversial. Didn’t generative theory already have this sorted by 1965? Chomsky does after all give a description of what we need to factor out at the very beginning of Aspects, where he introduces the competence/performance distinction with the remark that we’re interested in:
“an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance.”
Of course, lots of (silly) people have objected to this as being much too idealistic, but my complaint would be that it isn't nearly idealistic enough, in that I don't believe it properly circumscribes competence to the genuine exclusion of performance, as it has too narrow a view of what performance is.
The reasons for this are straightforward enough, though a little pedantic, so we need to think carefully about what idealization actually achieves for linguistic theory. Note foremost that, given the inscrutability of mental representations, there can never, even in principle, be circumstances that are somehow so ideal that we could make direct observations of a speaker's competence. In ideal conditions, we might eliminate all the distortions that can be introduced in performance, but we would still always be dealing with performed data.
Indeed, if you read the above quotation from Aspects again, you'll see that Chomsky plainly isn't talking about getting at mental representations directly, and he doesn't mean to be. He's still talking about performance, only perfect performance—the ideal speaker-listener demonstrates unhindered application of knowledge, not knowledge itself. Thus, note how the distortions that Chomsky lists are not only grammatically irrelevant, they are also irrelevant to the successful performance of whatever a grammar contains.
Crucially, what this limitation to performance data means is that the role of idealization is not (and cannot be) to eliminate everything that impinges upon competence, so that we can get a good look at it. It’s rather to eliminate everything that impinges upon performance of competence—to make performance perfect—so that performance is left as the only thing to factor out of the data, just as soon as we have an independent idea of what it is.
This subtlety is vital for assessing a claim Chomsky subsequently makes about the basic methodology of linguistics: that we can construct a theory of competence from observations of performance guided by the principle that “under the idealization set forth [...] performance is a direct reflection of competence.” Straightforward though this seems, it does not follow as a simple point of definition.
We've just observed that idealization on its own does nothing to define or eliminate the influence of performance on our data—it just makes the data ready for when we have an independent idea of what performance is—so we can only take perfect performance to directly reflect competence if we help ourselves to an ancillary assumption that performance itself just is the reflection of competence (i.e. its externalization). This of course goes hand-in-hand with a definition of competence as the internal specification of what can be performed.
To use the somewhat more transparent terminology of Chomsky's later I-language/E-language distinction, what this means altogether is that the theory of competence we have built using the Aspects methodology depends not only upon the basic distinction between competence and performance as a distinction between whatever's internal and whatever's external, but also upon a strictly additional assumption that the internal and the external are related to one another in the way of an intension and extension.
So, why be pedantic about this? It may seem that defining the relationship between competence and performance as an intension and extension is so obviously valid—perhaps even a kind of logical truth—that observing it to be implicit from the get-go in Aspects is hardly worth comment. However, even if this definition is sound, it isn't anything at all like a necessary truth, meaning that some justification must be found for it if we are to have confidence in a theory that takes it for granted.
To understand why, consider the fact that treating competence and performance as intension and extension casts performance as an entirely interpretative process, in the sense that every performance of a structured sound-meaning pair is no more than the mechanical saying and understanding of a sound-meaning pair that is first specified by the competence system (and notice, here, how having an intensional specification of sound-meaning pairs is, by definition, what commits us to a T-model of competence).
Another conceptual possibility, however, is that the competence system might furnish certain resources of sound, structure and meaning for a performance process that is creative, in the sense that sounds and meanings might be paired afresh in each act of performance, totally unrestricted by any grammatical specification of how they should go together. This might seem like such a crazy alternative that it is no alternative at all, but in fact I’ve just described in quite uncontroversial terms the task of language acquisition.
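Schematically (and this is just a toy rendering of the contrast, not a claim about any actual model), the difference can be put like this: an interpretative performer only externalizes pairs that the competence function already specifies, whereas a creative performer constructs the pairing in the act of use. The mappings and concept labels below are invented.

```python
import random

def competence(sound):
    """Intension: a fixed, mentally represented mapping from sounds to meanings."""
    specified = {"bird": "BIRD", "dog": "DOG"}
    return specified.get(sound)

def interpretative_performance(sound):
    # Performance merely reads off ('externalizes') what competence specifies.
    return (sound, competence(sound))

def creative_performance(sound, salient_concepts):
    # Performance itself constructs the pairing at the moment of use,
    # constrained only by whatever conceptual resources happen to be available.
    return (sound, random.choice(salient_concepts))

print(interpretative_performance("bird"))                      # ('bird', 'BIRD')
print(creative_performance("vogel", ["BIRD", "TREE", "SKY"]))  # pairing made in the act of use
```

The second function is, in miniature, what the learner's situation looks like before any language-specific pairings have been stored.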
We already know that performance has to be to some extent creative rather than interpretative because the competence system at birth is devoid of any language-specific content, so it initially has no capacity to specify any sound-meaning pairs for performance. Moreover, as the associations between sound and meaning that we learn are arbitrary, and are thus not predictable from purely linguistic input, the only way children have of formulating such associations is by observing in context how sounds are used meaningfully in performance. Thus, our question is really: to what extent is performance not creative in this way, or to what extent does its creative element give way to interpretation?
Here, our standard story first concedes (at least implicitly) that, given the arbitrariness of the sound-meaning relation, performance must be involved at least in the creation of atomic sound-meaning pairs, or whatever else it is that constitutes the lexicon (it makes no difference, as you’ll see later). But, because the proper structural descriptions for sentences of these atoms cannot be inferred from their performance, given the poverty of the stimulus, there must also be an innate syntactic competence that generates these structures and thereby accounts for speakers’ unbounded productivity.
These propositions are, I believe, totally beyond reproach, and yet, taken together, they do not license the conclusion that linguists have drawn from them: that language-specific atoms created in performance therefore serve as input to the innate syntax, such that structured sound-meaning pairs are only ever interpreted in performance, rather than being created in it.
To expose why this inference is unwarranted, one thing we have to bear in mind in light of our consideration of the proper role of idealization is that there is simply nothing about the data that we study that can tell us directly whether performance is interpretative or not. Because we are always looking at performed data, the limit of what we can know for certain about any particular sound-meaning pair that we bring before our attention is just that it is at least one possible outcome of the performance process in the instance that we observe. To know furthermore whether some pairing originates in competence or performance requires us to know the cognitive relationship that holds between them, and this is not manifest in performance itself. In order to establish that relation, like any good Chomskyan we must draw on the logic of acquisition.
Now, before we can give a satisfying treatment of the poverty of the stimulus, we need to be a little more precise about what it means for performance to be involved in pairings of purely atomic sounds and meanings—whatever they are—as there are two possibilities here, one of which we must reject.
On the one hand, we might imagine that the meaning of an expression is somehow a property of the expression’s uses in communication, such that sound-meaning pairs are constructed in performance because meanings themselves are derived entirely from behaviour. This is the Skinnerian or Quinean view and, for all of Chomsky’s original reasons, we can dismiss it out of hand.
The alternative we might imagine is that the meaning of an expression is some sort of mental representation, independent of any behaviour (i.e. a concept, or something like one), and, following a Fodorian line of argument, if these mental representations cannot be derived from experience of language (and they can't), then they must be pre-linguistic. Thus, the role for performance here is not to create meanings (in the sense of concepts), but rather to create the relations between an innate repertoire of possible meanings and whichever pieces of language we observe to represent them (schematically, this is something like taking the innate concept BIRD and assigning it either 'bird' or 'Vogel', though there is a lot wrong with this description of the lexicon, as I'll get to later).
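To make that schematic picture slightly more concrete, here is a minimal sketch (the observations and concept labels are invented, and nothing hangs on the particular learning rule): word-concept relations constructed purely by intersecting the concepts that are salient across the situations in which a word is heard, with no lexical entries in place before learning begins.

```python
def cross_situational(observations):
    """observations: list of (words_uttered, co_present_concepts) pairs.
    For each word, keep only the concepts that were salient on every occasion
    the word was used; no lexical entries exist before learning begins."""
    hypotheses = {}
    for words, concepts in observations:
        for word in words:
            if word in hypotheses:
                hypotheses[word] &= set(concepts)   # prune inconsistent candidates
            else:
                hypotheses[word] = set(concepts)    # first exposure: all candidates
    return hypotheses

observations = [
    ({"bird", "sings"}, {"BIRD", "SONG", "TREE"}),
    ({"bird", "flies"}, {"BIRD", "SKY"}),
    ({"dog", "sings"},  {"DOG", "SONG"}),
]

print(cross_situational(observations)["bird"])   # {'BIRD'}: a relation built from observed use alone
```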
A crucial corollary of this construction of atomic sound-meaning relations in performance is that at least our initial knowledge of such relations must not (indeed, cannot) consist of having mentally represented lexical entries for them, as the fact that we have to construct our lexicons by observing language use, given the arbitrariness of their contents, means they cannot also in the first instance determine language use, as that would be circular (another way of stating this is to ask: if you know that 'bird' means BIRD only because your lexicon tells you so, how did that information ever get into your lexicon when it was empty?).
But by now, it should be clear that the competence/performance distinction is not so straightforward as a distinction between knowledge and use because the means by which we come to know at least some sound-meaning relations is a matter of performance. This being the case, an important question we must ask is: why can’t we go on constructing such relations in performance indefinitely, becoming better and better at it as we gain more and more experience? What need do we have of a competence system to at some point specify such relations intensionally in the form of mentally represented sound-meaning pairs?
To pose this question more starkly, we have traditionally assumed that a person understands the meanings of words by having a mentally represented dictionary, somehow acquired from the environment, yet given the fact that children are not born with lexicons and nonetheless come to have lexical knowledge, isn't the lesson to learn from this that a lexicon is not necessary for such knowledge, and so the specification of word meanings is just not what lexicons are for? Note that these questions apply if you suppose a mental lexicon to list pairings of sounds and any sorts of mental representation, be they atomic concepts, feature sets, chunks of syntactic structure, or whatever else your derivational framework prefers.
As it happens, the lexicon in linguistic theory was never really devised to account for lexical knowledge in any straightforward way. Ever since Aspects, the lexicon has been little more than a repository for just whatever speakers seem to need as input to syntactic derivation in order to produce and understand structured expressions, without there being any independent constraints on what a lexicon can or cannot contain. So, to see where this acquisition conundrum really leads, we finally have to turn to the poverty of the stimulus.
Here, though, I will leave you with a cliff-hanger, for two reasons. First, in the next section of the argument, we’ll have to deal with some subsidiary misconceptions about descriptive and explanatory adequacy, as well as what (an) E-language is (if anything at all), yet I’ve already gone on quite enough for one post.
Second, though the poverty of the stimulus introduces some new dimensions to the argument, the logical structure of what follows will in essence be the same as what you’ve just read, so that the conclusion we’ll come to is that the fact that children are not born with language-specific grammars and yet nonetheless come to understand specific languages entails that it cannot be the function of an acquired grammar to specify the meanings of the expressions of a language, yet this is precisely what T-model grammars are designed to do. It’s no use getting on with the extra details needed for that conclusion, however, if you’re not at least on board with what I’ve said so far. I think it should all be rather uncontroversial for the FoL audience, but I’d like to invite comments nonetheless, in case it might be helpful to express the second half a little differently.