Wednesday, February 27, 2019

When academic jobs are hard to get

When I first graduated with a PhD, an academic job was not assured. Indeed, at the time (the mid 1970s into the mid 1980s) MIT was sending out acceptance letters warning that academic jobs were not thick on the ground and that, though they could assure four wonderful years of intellectual ferment and excitement, whether these would be rewarded with an academic job at the end was quite unclear. This was their version of buyer beware.

If anything, my impression is that things have gotten worse. Even those who land decent jobs often do so after several years as Post Docs (not a bad gig actually, I had several), and even then people who have all the qualifications for academic appointment (i.e. better CVs than me and my friends had when we entered the job market and landed positions) may not find anything at all. This is often when freshly minted PhDs start looking for non academic jobs in, e.g., industry.

Departments do not prepare students for this option. Truth be told, it is not clear that we are qualified to prepare students for this. Well, let me back up: some are. People doing work on computational linguistics often have industry connections, and occasionally some people in the more expansive parts of the language sciences have connections to the helping professions in HESP. Students from UMD have gone on to get non academic jobs in both these areas, sometimes requiring further education. So these routes exist. That said, they are not common, and faculty are generally not that well placed to advise on how to navigate this terrain.

What then to do to widen your options? Here is a paper from Nature that addresses the issues. Most of the advice is common sense: network, get things done, develop the soft skills that working on a PhD allows you to refine, get some tech savvy. All this makes sense. The one that I would like to emphasize is to learn to explain what you are doing in a simple, unencumbered way to others. This is really a remarkable skill, and good even if you stay in academia. However, in the outside world being able to explain complex things simply is a highly prized virtue.

At any rate, take a look. The world would be a better place if all graduates got the jobs they wanted. Sadly this is not that world. Here are some tips from someone who navigated the rough terrain.

Thursday, February 21, 2019

Omer on phases and minimality

I am not on Facebook. This means that I often miss some fun stuff, like Omer's posts on topics syntactic. Happily, he understands my problem and sends me links to his cogitations. For others who may suffer from a similar Facebook phobia, I link to his post here.

The topic is one that I have found intriguing for quite a while: do we really need two locality conditions? Two? Yes, Phases (aka, Bounding domains) and Minimality. Now, on their face these look quite different. The former places an absolute bound on computations, the latter bounds the reach of one expression when in the presence of another identical one. These two kinds of domain restrictions, thus, seem very different. However, looks can be deceiving. Not all phases count to delimit domains, at least if one buys into strong vs weak ones. If one does buy this then, as strong v phases are transitive vs and transitive vs will implicate at least two nominals, it looks like phases and minimality will both apply redundantly in these cases. Similarly, it is possible to evade minimality and phase impenetrability using similar "tricks" (e.g. becoming specifiers of the same projection). At any rate, once one presses, it appears that the two systems generate significant redundancy, which suggests that one of them might be dispensable. This is where Omer's post comes in. He shows that Minimality can apply in some cases where there is no obvious tenable phase based account (viz. phase internally). Well, whether this is right or not, the topic is a nice juicy one and well worth thinking about. Omer's post is a great place to begin.

Another logical problem of language acquisition: Part 1

Some of you may recall that I invited FoLers to submit stuff for blog fodder on the site. I have received a few takers, but none as enthusiastic as Callum Hackett. Below is the first part of an extended discussion based on his thesis topic. I like the idea of being in on the ground floor wrt this kind of stuff: new thoughts by young colleagues that lead me by the hand in new directions. I hope that Callum is the first of many who decide to educate the rest of us. Thx.


Another logical problem of language acquisition: Part 1

Following on from various interesting discussions here on FoL, I’ve been meaning to elaborate on some of the comments I’ve made in places about how we might want to reconsider our grammatical architecture if we want generative theory to be a truly successful contributor to cognitive and evolutionary science. To help frame the major issues, in this first post I’m going to examine the basic logic of the competence/performance distinction, as well as some of its complications. Then, in a second, I’ll consider in more detail the actual relationship between competence and performance, and the implications this has for what our theory of competence ought to look like, if indeed having a theory of competence is what should qualify generative grammar as a science of the mind.

To advertise—these posts being a truncation of some of my doctoral work—so long as what we mean by ‘competence’ is the system of linguistic knowledge represented in the mind of a speaker, independent of any use that’s made of it, then my conclusion will be that any model of competence we can identify as a T-model (so, basically everything since Aspects) logically cannot work because the T-model misunderstands the relationship between knowledge and use.

Having said this, I also believe that we can devise a different architecture that preserves the best of the rest of generative theory, that gives us better stories to tell about cognition and evolution (and so better chances of being funded), and—my personal favourite—that allows us to make some strategic concessions to behaviourists that in the end seal the deal for nativism.

To make a start on all this, first we should recognize that a theory of competence is wholly inaccessible to us unless we have some idea of how competence relates to performance, simply because all of our analytical methods deal exclusively with performance data, due to the mental representations that are constitutive of competence being by definition inscrutable.

Of course, this quite trivial observation doesn’t mean that linguists need to know any of the details about how competence gets performed; it just means that, because all the data available to us always is performed, we need at least an abstract conception of what performance amounts to, purely so we can factor it out.

So far, so uncontroversial. Didn’t generative theory already have this sorted by 1965? Chomsky does after all give a description of what we need to factor out at the very beginning of Aspects, where he introduces the competence/performance distinction with the remark that we’re interested in:

 “an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance.”

Of course, lots of (silly) people have objected to this as being much too idealistic but my complaint would be that it isn’t nearly idealistic enough, in that I don’t believe it properly circumscribes competence to the genuine exclusion of performance, as it has too narrow a view of what performance is.

The reasons for this are straightforward enough, though a little pedantic, so we need to think carefully about what idealization actually achieves for linguistic theory. Note foremost that, given the inscrutability of mental representations, there can never, even in principle, be circumstances that are somehow so ideal that we could make direct observations of a speaker’s competence. In ideal conditions, we might eliminate all the distortions that can be introduced in performance, but we would still always be dealing with performed data.

Indeed, if you read the above quotation from Aspects again, you’ll see that Chomsky plainly isn’t talking about getting at mental representations directly, and he doesn’t mean to be. He’s still talking about performance, only perfect performance: the ideal speaker-listener demonstrates unhindered application of knowledge, not knowledge itself. Thus, note how the distortions that Chomsky lists are not only grammatically irrelevant, they are also irrelevant to the successful performance of whatever a grammar contains.

Crucially, what this limitation to performance data means is that the role of idealization is not (and cannot be) to eliminate everything that impinges upon competence, so that we can get a good look at it. It’s rather to eliminate everything that impinges upon performance of competence—to make performance perfect—so that performance is left as the only thing to factor out of the data, just as soon as we have an independent idea of what it is.

This subtlety is vital for assessing a claim Chomsky subsequently makes about the basic methodology of linguistics: that we can construct a theory of competence from observations of performance guided by the principle that “under the idealization set forth [...] performance is a direct reflection of competence.” Straightforward though this seems, it does not follow as a simple point of definition.

We’ve just observed that idealization on its own does nothing to define or eliminate the influence of performance on our data (it just makes the data ready for when we have an independent idea of what performance is), so we can only take perfect performance to directly reflect competence if we help ourselves to an ancillary assumption that performance itself just is the reflection of competence (i.e. its externalization). This of course goes hand-in-hand with a definition of competence as the internal specification of what can be performed.

To use the somewhat more transparent terminology of Chomsky’s later I-language/E-language distinction, what this means altogether is that the theory of competence we have built using the Aspects methodology depends not only upon the basic distinction between competence and performance as a distinction between whatever’s internal and whatever’s external, but also upon a strictly additional assumption that the internal and the external are related to one another in the way of an intension and extension.

So, why be pedantic about this? It may seem that defining the relationship between competence and performance as an intension and extension is so obviously valid (perhaps even a kind of logical truth) that observing it to be implicit from the get-go in Aspects is hardly worth comment. However, even if this definition is sound, it isn’t anything at all like a necessary truth, meaning that some justification must be found for it if we are to have confidence in a theory that takes it for granted.

To understand why, consider the fact that treating competence and performance as intension and extension casts performance as an entirely interpretative process, in the sense that every performance of a structured sound-meaning pair is no more than the mechanical saying and understanding of a sound-meaning pair that is first specified by the competence system (and notice, here, how having an intensional specification of sound-meaning pairs is, by definition, what commits us to a T-model of competence).

Another conceptual possibility, however, is that the competence system might furnish certain resources of sound, structure and meaning for a performance process that is creative, in the sense that sounds and meanings might be paired afresh in each act of performance, totally unrestricted by any grammatical specification of how they should go together. This might seem like such a crazy alternative that it is no alternative at all, but in fact I’ve just described in quite uncontroversial terms the task of language acquisition.

We already know that performance has to be to some extent creative rather than interpretative because the competence system at birth is devoid of any language-specific content, so it initially has no capacity to specify any sound-meaning pairs for performance. Moreover, as the associations between sound and meaning that we learn are arbitrary, and are thus not predictable from purely linguistic input, the only way children have of formulating such associations is by observing in context how sounds are used meaningfully in performance. Thus, our question is really: to what extent is performance not creative in this way, or to what extent does its creative element give way to interpretation?

Here, our standard story first concedes (at least implicitly) that, given the arbitrariness of the sound-meaning relation, performance must be involved at least in the creation of atomic sound-meaning pairs, or whatever else it is that constitutes the lexicon (it makes no difference, as you’ll see later). But, because the proper structural descriptions for sentences of these atoms cannot be inferred from their performance, given the poverty of the stimulus, there must also be an innate syntactic competence that generates these structures and thereby accounts for speakers’ unbounded productivity.

These propositions are, I believe, totally beyond reproach, and yet, taken together, they do not license the conclusion that linguists have drawn from them: that language-specific atoms created in performance therefore serve as input to the innate syntax, such that structured sound-meaning pairs are only ever interpreted in performance, rather than being created in it.

To expose why this inference is unwarranted, one thing we have to bear in mind in light of our consideration of the proper role of idealization is that there is simply nothing about the data that we study that can tell us directly whether performance is interpretative or not. Because we are always looking at performed data, the limit of what we can know for certain about any particular sound-meaning pair that we bring before our attention is just that it is at least one possible outcome of the performance process in the instance that we observe. To know furthermore whether some pairing originates in competence or performance requires us to know the cognitive relationship that holds between them, and this is not manifest in performance itself. In order to establish that relation, like any good Chomskyan we must draw on the logic of acquisition.

Now, before we can give a satisfying treatment of the poverty of the stimulus, we need to be a little more precise about what it means for performance to be involved in pairings of purely atomic sounds and meanings—whatever they are—as there are two possibilities here, one of which we must reject.

On the one hand, we might imagine that the meaning of an expression is somehow a property of the expression’s uses in communication, such that sound-meaning pairs are constructed in performance because meanings themselves are derived entirely from behaviour. This is the Skinnerian or Quinean view and, for all of Chomsky’s original reasons, we can dismiss it out of hand.

The alternative we might imagine is that the meaning of an expression is some sort of mental representation, independent of any behaviour (i.e. a concept, or something like one), and, following a Fodorian line of argument, if these mental representations cannot be derived from experience of language (and they can’t), then they must be pre-linguistic. Thus, the role for performance here is not to create meanings (in the sense of concepts), but rather to create the relations between an innate repertoire of possible meanings and whichever pieces of language we observe to represent them (schematically, this is something like taking the innate concept BIRD and assigning it either ‘bird’ or ‘Vogel’, though there is a lot wrong with this description of the lexicon, as I’ll get to later).

A crucial corollary of this construction of atomic sound-meaning relations in performance is that at least our initial knowledge of such relations must not (indeed, cannot) consist of having mentally represented lexical entries for them, as the fact that we have to construct our lexicons by observing language use, given the arbitrariness of their contents, means they cannot also in the first instance determine language use, as that would be circular (another way of stating this is to ask: if you know that ‘bird’ means BIRD only because your lexicon tells you so, how did that information ever get into your lexicon when it was empty?).

But by now, it should be clear that the competence/performance distinction is not so straightforward as a distinction between knowledge and use because the means by which we come to know at least some sound-meaning relations is a matter of performance. This being the case, an important question we must ask is: why can’t we go on constructing such relations in performance indefinitely, becoming better and better at it as we gain more and more experience? What need do we have of a competence system to at some point specify such relations intensionally in the form of mentally represented sound-meaning pairs?

To pose this question more starkly, we have traditionally assumed that a person understands the meanings of words by having a mentally represented dictionary, somehow acquired from the environment, yet given the fact that children are not born with lexicons and nonetheless come to have lexical knowledge, isn’t the lesson to learn from this that a lexicon is not necessary for such knowledge, and so the specification of word meanings is just not what lexicons are for? Note that these questions apply if you suppose a mental lexicon to list pairings of sounds and any sorts of mental representation, be they atomic concepts, feature sets, chunks of syntactic structure, or whatever else your derivational framework prefers.

As it happens, the lexicon in linguistic theory was never really devised to account for lexical knowledge in any straightforward way. Ever since Aspects, the lexicon has been little more than a repository for just whatever speakers seem to need as input to syntactic derivation in order to produce and understand structured expressions, without there being any independent constraints on what a lexicon can or cannot contain. So, to see where this acquisition conundrum really leads, we finally have to turn to the poverty of the stimulus.

Here, though, I will leave you with a cliff-hanger, for two reasons. First, in the next section of the argument, we’ll have to deal with some subsidiary misconceptions about descriptive and explanatory adequacy, as well as what (an) E-language is (if anything at all), yet I’ve already gone on quite enough for one post.

Second, though the poverty of the stimulus introduces some new dimensions to the argument, the logical structure of what follows will in essence be the same as what you’ve just read, so that the conclusion we’ll come to is that the fact that children are not born with language-specific grammars and yet nonetheless come to understand specific languages entails that it cannot be the function of an acquired grammar to specify the meanings of the expressions of a language, yet this is precisely what T-model grammars are designed to do. It’s no use getting on with the extra details needed for that conclusion, however, if you’re not at least on board with what I’ve said so far. I think it should all be rather uncontroversial for the FoL audience, but I’d like to invite comments nonetheless, in case it might be helpful to express the second half a little differently.

Thursday, February 14, 2019

Shut up and dance

In Science Advances today, some unexpected (to me) findings about the honeybee waggle dance:

"Strikingly, colonies with disoriented dances had greater foraging success. Over time, bees exposed to disoriented dances showed reduced interest in dancing nestmates. This may explain why disoriented colonies had a higher foraging rate than oriented colonies, as bees did not waste time waiting for information."

Wednesday, February 6, 2019

Bee accountants

Today in Science Advances, "Numerical cognition in honeybees enables addition and subtraction".

From the abstract: "... We show that honeybees, with a miniature brain, can learn to use blue and yellow as symbolic representations for addition or subtraction. In a free-flying environment, individual bees used this information to solve unfamiliar problems involving adding or subtracting one element from a group of elements. This display of numerosity requires bees to acquire long-term rules and use short-term working memory. ..."