Morten Christiansen and Nick Chater have done us all a
favor. They have written a manifesto (here, C&C)
outlining what they take to be a fruitful way of studying language. To the
degree that I understand it, it seems plausible enough given its apparent
interests. It focuses on the fact that language as we encounter it on a daily
basis is a massive interaction effect and the manifesto heroically affirms the
truism (just love those papers that bravely go out on a limb (the academic
version of exercise?)) that explaining interaction effects requires an
“integrated approach.” Let me emphasize how truistic this is: if X is the result of many interacting parts, then only a story that enumerates those parts, describes their properties, and explains how they interact can explain X. Thus a non-integrated account of an interaction effect is a logical non-starter. It is also worth pointing out the obvious: this
is not a discovery, it is a tautology (and a rather superficial one at that),
and not one that anyone (and here I include C&C’s bête noire Monsieur Chomsky (I am just back from vacationing in Quebec, so excuse the francotropism))
can, should or would deny (in fact, we note below that he made just this
observation oh so many (over 60) years ago!).
That said, C&C, from where I sit, does make two
interesting moves that go beyond the truistic. The first is that it takes the
central truism to be revolutionary and in need of defense (as if anyone in
their right mind would ever deny it). The second noteworthy feature is that the
transparent truth of the truism (note that truisms need not be self-evident (think of theorems), but this one is) seems to license a kind of faith-based holism, one
that goes some distance in thwarting the possibility of a non-trivial
integrated approach of the kind C&C urges.
Before elaborating these points in more detail, I need (the
need here is pathetically psychological, sort of like a mental tic, so excuse
me) to make one more overarching point: C&C appears to have no idea what GG
is, what its aims are or what it has accomplished over the last 60 years. In
other words, when C&C talks about GG (especially (but not uniquely) about the
Chomsky program) it is dumb, dumb, dumb! And it is not even originally dumb. It
is dumb in the old familiar way. It is boringly dumb. It is stale dumb. Dumb at
one remove from other mind numbing dumbness. Boringly, predictably,
pathetically dumb. It makes one wonder whether the authors ever read any
GG material. I hope not. For having failed to read what it criticizes would be
the only half decent excuse for the mountains of dumb s**t that C&C
asserts. If I were one of the authors, I would opt for intellectual
irresponsibility (bankruptcy (?)) over immeasurable cluelessness if hauled
before a court of scientific inquiry. At any rate, never having read the material better explains the wayward claims confidently asserted than having read it and misconstrued it so badly. As I have addressed C&C’s secondhand
criticisms more than (ahem) once, I will allow the curious to scan the archives
for relevant critical evisceration.[1]
Ok, the two main claims: It is a truism that language
encountered “in the wild” is the product of many interacting parts. This
observation was first made in the modern period in Syntactic Structures. In other words, a long time ago.[2]
That is, the observation is old, venerable and, by now, commonplace. In
fact, the distinction between ‘grammatical’ and ‘acceptable’ first made over 60
years ago relies on the fact that a speaker’s phenomenology with respect to utterances is not exclusively a function of an uttered sentence’s grammatical (G) status.
Other things matter, a lot. In the early days of GG, factors such as memory
load, attention, pragmatic suitability, semantic sensibility (among other
factors) were highlighted in addition to
grammaticality. So, it was understood early on that many, many factors went
into an acceptability judgment, with grammaticality being just one relevant
feature. Indeed, this observation is what lies behind the
competence/performance distinction (a point that C&C seems not to appreciate (see p. 3)), the distinction aiming to isolate the grammatical factors behind acceptability, thereby, among other things, leaving room for other factors to play a role.[3]
And this was not just hand-waving or theory-protecting talk (again contra C&C, boy is its discussion DUMB!!). A good deal of work was conducted early on trying to understand how grammatical structure could interact with these other factors to increase memory load and perceived unacceptability (just think of the non-trivial distinction between center-embedding and self-embedding and its implications for memory architecture).[4]
This kind of work proceeds apace even today, with grammaticality understood to
be one of the many factors that go into making judgments gradiently acceptable.[5]
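To make the memory-architecture point concrete, here is a toy sketch (my own illustration, not anyone’s actual model): nested dependencies of the center-embedding sort force a parser’s stack to grow with the depth of embedding, while iterated dependencies can be checked with constant memory. The a/b encoding is an illustrative assumption, not a claim about any real grammar.

```python
# Toy illustration of why center-embedding taxes memory: 'a' opens a
# dependency, 'b' closes the most recently opened one.
def max_stack_depth(s: str) -> int:
    depth = max_depth = 0
    for ch in s:
        if ch == "a":
            depth += 1                        # a new unresolved dependency
            max_depth = max(max_depth, depth)
        elif ch == "b":
            depth -= 1                        # the most recent one is resolved
    return max_depth

print(max_stack_depth("ababab"))  # 1: dependencies resolved one at a time
print(max_stack_depth("aaabbb"))  # 3: nested a^n b^n; memory grows with n
```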
Indeed, there is no grammatically informed psycholinguistic work done today (or before) that doesn’t understand that G/UG capacities are but one factor among others needed to explain real-time acquisition, parsing, production, etc. UG is one factor in accounting for G acquisition (as Jeff Lidz, Charles Yang, Lila Gleitman etc. have endlessly emphasized) and language-particular Gs are just one factor in explaining parsability (which is, in turn, one factor underlying acceptability) (as Colin Phillips, Rick Lewis, Shravan Vasishth, Janet Fodor, Bob Berwick, Lyn Frazier, Jon Sprouse, etc. etc. etc. have endlessly noted). Nobody denies the C&C truism that language use involves multiple interacting variables. Nobody is that stupid!
So, C&C is correct in noting that if one’s interest is in figuring out how language is
deployed/acquired/produced/parsed/etc. then much more than a competence theory
will be required. This is not news. This is not even an insight. The question
is not if this is so, but how it is so. Given this, the relevant
question is: what tree is C&C barking up by suggesting that this is
contentious?
I have two hypotheses. Here they are.
1. C&C
doesn’t take G features to be at all
relevant to acceptability.
2. C&C
favors a holistic rather than an analytic approach to explaining interaction
effects in language.
Let’s discuss each in turn.
C&C is skeptical that grammaticality is a real feature
of natural language expressions. In other words, C&C’s beef with the traditional GG conception (in which G/UG properties are one factor among many) lies with assigning G/UG any role at all.
This is not as original as it might sound. In fact, it is quite a traditional
view, one that Associationists and Structuralists held about 70 years ago. It
is the view that GG defenestrated, but apparently, did not manage to kill (next
time from a higher floor please). The view amounts to the idea that G
regularities (C&C is very skeptical that UG properties exist at all; I return to this presently) are just probabilistic generalizations over available
linguistic inputs. This is the view embodied in Structuralist discovery
procedures (and suggested in current Deep Learning approaches) wherein levels
were simple generalizations over induced structures of a previous lower level.
Thus, all there is to grammar is successively more abstract categories built up inductively from lower-level, less abstract categories. On this view, grammatical categories are classes of words, which are definable as classes of morphemes, which are definable as classes of phonemes, which are definable as classes of phones. The higher levels are, in effect, simple inductive generalizations over lower-level entities. The basic thought is that higher-level categories are entirely reducible to lower-level distributional patterns. Importantly, in this sort of analysis, there are no
(and can be no) interesting theoretical entities, in the sense of real abstract
constructs that have empirical consequences but are not reducible or definable
in purely observational terms. In other words, on this view, syntax is an
illusion and the idea that it makes an autonomous contribution to acceptability
is a conceptual error.
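To see how spare this picture is, here is a toy “discovery procedure” of my own devising (a sketch only; neither the Structuralists nor C&C literally wrote this): it induces word classes purely from distributional overlap in a miniature corpus, using no vocabulary beyond the observable strings themselves.

```python
# Toy discovery procedure: induce word categories from distribution alone.
# Everything is defined over observable strings; no theoretical entities.
from collections import defaultdict

corpus = [
    "the cat sleeps", "the dog sleeps", "a cat runs",
    "a dog runs", "the bird sings", "a bird sleeps",
]

# Represent each word by the (left, right) neighbor contexts it occurs in.
contexts = defaultdict(set)
for sentence in corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]
    for i in range(1, len(words) - 1):
        contexts[words[i]].add((words[i - 1], words[i + 1]))

def similarity(w1: str, w2: str) -> float:
    # Overlap of context sets: the inductive analogue of "same category".
    shared = contexts[w1] & contexts[w2]
    return len(shared) / len(contexts[w1] | contexts[w2])

vocab = sorted(contexts)
for i, w1 in enumerate(vocab):
    for w2 in vocab[i + 1:]:
        if similarity(w1, w2) > 0.3:
            print(f"{w1!r} and {w2!r} pattern like a single category")
```

The sketch groups ‘the’ with ‘a’, ‘cat’ with ‘dog’, and so on, and everything in it is definable in observational terms. That is exactly what makes it a discovery procedure, and exactly why, on this picture, syntax can be nothing over and above distribution.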
Now, I am not sure
whether C&C actually endorses this view, but it does make noises in that
direction. For example, it endorses a particular conception of constructions and
puts it “at the heart” of its “alternative framework” (4). The virtue of C&C’s constructions is that they are built up from smaller parts in a probabilistically guided manner. Here is C&C (4):
At the heart of this emerging alternative framework are constructions, which are learned pairings of form and meaning
ranging from meaningful parts of words (such as word endings, for example,
‘-s’, ‘-ing’) and words themselves (for example, ‘penguin’) to multiword
sequences (for example, ‘cup of tea’) to lexical patterns and schemas (such as,
‘the X-er, the Y-er’, for example, ‘the bigger, the better’). The
quasi-regular nature of such construction grammars allows them to capture both
the rule-like patterns as well as the myriad of exceptions that often are
excluded by fiat from the old view built on abstract rules. From this point of
view, learning a language is learning the skill of using constructions to
understand and produce language. So, whereas the traditional perspective viewed
the child as a mini-linguist with the daunting task of deducing a formal
grammar from limited input, the construction-based framework sees the child as
a developing language-user, gradually honing her language-processing skills.
This requires no putative universal grammar but, instead, sensitivity to
multiple sources of probabilistic information available in the linguistic
input: from the sound of words to their co-occurrence patterns to information
from semantic and pragmatic contexts.
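For concreteness, here is a minimal sketch of what extracting stored multiword “constructions” from “co-occurrence patterns” might look like in its very barest form (the PMI criterion and the toy corpus are my illustrative assumptions; C&C specifies no such mechanism):

```python
# Toy construction extraction: treat frequent, strongly associated bigrams
# (high pointwise mutual information) as stored units like "cup of tea".
# The corpus and the PMI threshold are illustrative assumptions only.
import math
from collections import Counter

tokens = ("i drank a cup of tea . she drank a cup of tea . "
          "he wants a cup of coffee .").split()

unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
n = len(tokens)

def pmi(pair: tuple) -> float:
    w1, w2 = pair
    p_joint = bigrams[pair] / (n - 1)
    p_independent = (unigrams[w1] / n) * (unigrams[w2] / n)
    return math.log2(p_joint / p_independent)

for pair, count in bigrams.items():
    if count >= 2 and pmi(pair) > 1.0:  # frequent and strongly associated
        print(" ".join(pair), "->", round(pmi(pair), 2))
```

On this toy criterion, ‘cup of’, ‘of tea’, ‘a cup’ and the like come out as stored units; the question raised below is whether anything grammatical is left over once such units are subtracted.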
The quoted passage does not preclude a distinctive G-ish contribution to acceptability, but its dismissal of any UG
contribution to the process suggests that it is endorsing a very strong
rejection of the autonomy of syntax thesis.[6]
Let me repeat, a commitment to the
centrality of constructions does not require this. However, the C&C
version seems to endorse it. If this is correct, then C&C sees the central problem with modern GG as its commitment to the idea that syntactic structure is not reducible to statistical distributional properties or to the semantic, pragmatic, phonological or phonetic properties of utterances. In other words, C&C rejects the GG idea that grammatical structure is real and makes a genuine contribution to the observables we track through acceptability.
This view is indeed radical, and virtually certain to be
incorrect.[7]
If there is one thing that all
linguists agree on (including constructionists like Jackendoff and Culicover)
it’s that syntax is real. It is not
reducible to other factors. And if this is so, then G structure exists
independently of other factors. I also think that virtually all linguists
believe that syntax is not the sum of statistical regularity in the PLD.[8]
And there is good reason for this; it is morally certain that many of the grammatical factors that linguists have identified over the last 60 years have linguistically proprietary roots and leave few footprints in the PLD. To argue that this standard picture is false requires a lot of work, none of which C&C does or points to. Of course, C&C cannot be held responsible for this failing, for C&C has no idea what this work argues because C&C’s authors appear never to have read any of it (or, if it has been read, it has not been understood, see above). But
were C&C informed by any of this work, it would immediately appreciate that
it is nuts to think that it is possible to eliminate G features as one factor in acceptability.[9]
In sum, one possible reading of C&C is that it endorses
the old Structuralist idea of discovery procedures, denies the autonomy of
syntax thesis (i.e. the thesis that syntax is “real”) and believes in (yes, I’ve got to say it) the old Empiricist/Associationist trope that the language capacity is nothing but a reflection of tracked statistical regularities. It’s back, folks. No idea ever really dies, no matter how unfounded and implausible it is and how many times it has been stabbed through the heart with sharp arguments.
Before going on to the second point, let me add a small
digression concerning constructions. Look, anyone who works on the G of a
particular language endorses some form of constructionism (see here
for some discussion). Everyone assumes that morphemes have specific
requirements, with specific selection restrictions. These are largely
diacritical and part of the lexical entry of the morpheme. Gs are often
conceived as checking these features in the course of a derivation and one of
the aims of a theory of Gs (UG) is to specify the structural/derivational
conditions that regulate this feature checking. Thus, everyone’s favorite language-specific G has some kinds of constructions that encode information
that is not reducible to FL or UG principles (or not so reducible as far as we
can tell).
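For the concretely minded, here is a minimal sketch of the feature-checking picture just described (the lexical items, the feature inventory and the merge operation are my illustrative inventions, not any particular theory’s):

```python
# Minimal sketch of diacritic feature checking: lexical entries carry
# selectional features that a derivation must check off one by one.
from dataclasses import dataclass, field

@dataclass
class LexItem:
    form: str
    category: str
    selects: list = field(default_factory=list)  # categories this head demands

def merge(head: LexItem, comp: LexItem) -> LexItem:
    # One derivational step: check the head's first selectional feature
    # against the category of its complement.
    if not head.selects or head.selects[0] != comp.category:
        raise ValueError(f"'{head.form}' cannot select '{comp.form}'")
    # Feature checked off: project the head with its remaining requirements.
    return LexItem(f"[{head.form} {comp.form}]", head.category, head.selects[1:])

devour = LexItem("devour", "V", ["D"])      # 'devour' demands a nominal object
the_cake = LexItem("the cake", "D")
print(merge(devour, the_cake).form)         # [devour the cake]

try:
    merge(LexItem("sleep", "V"), the_cake)  # 'sleep' selects no object
except ValueError as err:
    print(err)                              # 'sleep' cannot select 'the cake'
```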
Moreover, it is entirely consistent with this view that
units larger than morphemes code this kind of information. The diacritics can
be syncategorematic and might grace structures that are pretty large (though
given something like an X’ syntax with heads or a G with feature percolation
the locus of the diacritical information can often be localized on a “listable” linguistic object in the lexicon). So, the idea that C&C grabs with both
hands and takes to be new and revolutionary is actually old hat. What distinguishes the kind of constructionism one finds in C&C from the standard variety is an idea central to GG: constructions are not “arbitrary.” Rather, constructions have a substructure regulated by more abstract principles of grammar (and UG). C&C seems to think that
anything can be a construction. But we know that this is false.[10]
Constructions obey standard principles of grammar (e.g. no mirror-image constructions, no constructions that violate the ECP or binding theory, etc.). So though there can be many kinds of constructions compiling all sorts of diverse information, there are some pretty hard constraints regulating what a possible construction is.
Why do I mention this? Because I could not stop myself!
Constructions lie at the heart of C&C’s “alternative framework” and
nonetheless C&C has no idea what they are, that they are standard fare in
much of standard GG (even minimalist Gs are knee-deep in such diacritical features) and that they are not the arbitrary pairings that C&C takes them to be. In other words, once again C&C is mind-numbingly ignorant (or misinformed).
So that’s one possibility: C&C denies that G properties are real. There is a second possibility, one that does not preclude the first and is often found in tandem with it, but is nonetheless different. The
second problem C&C sees with the standard view lies with its analytical
bent. Let me explain.
The standard view of performance within linguistics is that
it involves contributions of many factors. Coupled with this is a methodology: the right way to study such effects is to identify the factors involved, figure out their
particular features and see how they combine in complex cases. One of the
problems with studying such phenomena is that the interacting factors don’t
always nicely add up. In other words,
we cannot just add the contributions of each component together to get a nice well-behaved
sum at the end. That’s what makes some problems so hard to solve analytically
(think turbulence). But that’s still the standard way to go about matters. GG endorsed this view from the
get-go. To understand how language works in the wild, figure out what factors
go into making, say, an utterance, and see how these factors interact.
Linguists focused on one factor (G and UG) but understood that other factors
also played a role (e.g. memory, attention, semantic/pragmatic suitability
etc.). The idea was that in analyzing (and understanding) any bit of linguistic
performance, grammar would be one part of the total equation, with its own
distinctive contribution.[11]
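To see why the pieces need not “nicely add up,” consider a toy model (the coefficients are invented purely for illustration, not estimates of anything) in which grammaticality and memory load jointly determine acceptability via an interaction term:

```python
# Toy non-additive interaction: the joint effect is not the sum of the
# separate effects. All numbers are invented for illustration.
def acceptability(grammatical: float, memory_load: float) -> float:
    main_effects = 0.7 * grammatical - 0.2 * memory_load
    interaction = -0.4 * grammatical * memory_load
    return main_effects + interaction

print(round(acceptability(1.0, 0.0), 2))  # 0.7: grammaticality alone
print(round(acceptability(0.0, 1.0), 2))  # -0.2: memory load alone
print(round(acceptability(1.0, 1.0), 2))  # 0.1: together, not the additive 0.5
```

The two separate contributions (0.7 and -0.2) sum to 0.5, but the joint effect is 0.1; the gap is the interaction, and in real cases its shape is unknown in advance.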
Two things are noteworthy about this. First, it is hard,
very hard. It requires understanding how (at least) two “components” function
as well as understanding how they interact.
As interactions need not be additive, this can be a real pain, even
under ideal conditions where we really know a lot (that’s why engineers need to
do more than simply apply the known physics/chemistry/biology). Moreover,
interaction effects can be idiosyncratic and localized, working differently in
different circumstances (just ask your favorite psycho/neuro linguist about
task effects). So, this kind of work is both very demanding and subject to
quirky destabilizing effects. Recall Fodor’s observation: the more modular a problem is, the more likely it is to be solvable at all. This reflects the problems that interaction effects generate.[12]
At any rate, this is the standard way science proceeds when approaching complex phenomena. It factors a phenomenon into its parts and then puts the parts back together. This is often called atomism or reductionism, but it is really just analysis plus synthesis, and it has proven to be the only real game in town.[13]
That said, many bridle at this approach and yearn for more holistic methods.
Connectionists used to sing the praises of holism: only the whole system
computes! You cannot factor a problem into its parts without destroying it.
Holists often urge simulation in place of analysis (let’s see how the whole
thing runs). People like me find this to be little more than the promotion of
obscurantism (and not only me, see here for a nice takedown in the domain of face perception).
Why do I mention this here? Because there is a sense in which C&C seems to object not only to the idea that Grammar is real, but
also to the idea that the right way to approach these interaction effects is
analytically. C&C doesn’t actually say this, but it suggests it in its
claims that the secret to understanding language in the wild lies with how all
kinds of information are integrated quickly in the here and now. The system as
a whole gives rise to structure (which “may [note the weasel word here, btw,
NH] explain why language structure and processing is highly local in the
linguistic signal” (5))[14]
and the interaction of the various factors eases the interpretation problem (though as Gleitman and Trueswell and friends have shown, having too much information is itself a real problem (see here, here and here)).
The prose in C&C suggests to me that only at the grain of the blooming
buzzing interactive whole will linguistic structure emerge. If this is right,
then the problem with the standard view is not merely that it endorses the
reality of grammar, but that it takes the right approach to be analytic rather
than holistic. Again, C&C does not expressly say this, but it does suggest it, and it makes sense of its
dismissal of “fragmented” investigations of the complex phenomenon. On this view, we need to solve all the problems at once and together, rather than solving them piecemeal and then fitting the solutions together. Of course, we all know that there is no
“best” way to proceed in these complex matters; that sometimes a more focused
view is better and sometimes a more expansive one. But the idea that an
analytic approach is “doomed to fail” (1) surely bespeaks an antipathy towards
the analytic approach to language.
An additional point: note that if one thinks that all there is to language is the statistical piecing together of diverse kinds of information, then one is really against the idea that language in the wild is the result of distinct interacting modules with their own powers and properties. This, again, is an old idea. Again, you all know who believed this (hint: it starts with an E). So, if one were looking for an overarching unifying
theme in C&C, one that is not trotted out explicitly but holds the paper
together, then one could do worse than look to Associationism/Empiricism. This
is the glue that holds the various parts together, from the hostility to the
very idea that grammars are real to the conviction that the analytic approach
(standard in the sciences) is doomed to failure.
There is a lot of other stuff in this paper that is also not very good (or convincing). But I leave it as an exercise for the reader to find these bits and dispose of them (take a look at the discussion of language and cultural evolution (on p. 6) for a real good time. I am not sure, but it struck me as verging on the incoherent, mixing up the problem of language change with the problem of the emergence of a facility for language). Suffice it to
say that C&C adds another layer to the pile of junk written on language
perpetrated on the innocent public by the prestige journals. Let me end with a
small rant on this.
C&C appeared in Nature Human Behaviour, a new journal in the Nature stable. This is reputed to be a fancy brand with standards (don’t believe it for a moment; it’s all show business now).
I doubt that Nature believes that it
publishes junk. Maybe it takes it to be impossible to evaluate opinion or
“comment” pieces. Maybe it thinks that taste cannot be adjudicated. Maybe. But
I doubt it. Rather, what we are witnessing here is another case of Chomsky
bashing, with GG as collateral damage. It is not only this, but it is some of this. The other factor is the rise of
big data science. I will have something to say about this in a later post. For
now, take a look at C&C. It’s the latest junk installment of a story that doesn’t
get better with repetition. But in this case all the arguments are stale as
well as dumb. Maybe their expiration date will come soon. One can only hope, even if such hope is irrational given the evidence.
[1]
Type ‘Evans and Levinson’ (cited in C&C) or ‘Vyvyan Evans’ or ‘Everett’ in
the search section for a bevy of replies to the old, tired, incorrect claims that C&C throws out like confetti at a victory parade.
[2]
Actually, I assume that Chomsky’s observations are just another footnote to Plato or Aristotle, though I don’t know what text he might have been footnoting. But, as you know, the guy LOVES footnotes!
[3]
The great sin of Generative Semantics was to conflate grammaticality and acceptability
by, in effect, treating any hint of unacceptability as something demanding a
grammatical remedy.
[4]
I should add that the distinction between these two kinds of structures (center vs. self embedding) is still often ignored, or the two are run together. At times, it makes one despair about whether there is any progress at all in the mental sciences.
[5]
And this, among other things, led Chomsky to deny that there is a clean
grammatical/ungrammatical distinction, insisting that there are degrees of
grammaticality as well as the observed degrees of acceptability. Jon Sprouse is
the contemporary go-to person on these issues.
[6]
And recall, the autonomy of syntax thesis is a very weak claim. It states that
syntactic structure is real and hence not reducible to observable features of
the linguistic expression. So syntax is not just a reflex of meaning or sound
or probabilistic distribution or pragmatic felicity or… Denying this weak claim
is thus a very strong position.
[7]
There is an excellent discussion of the autonomy of syntax and what it means
and why it is important in the forthcoming anniversary volume on Syntactic
Structures edited by Lasnik, Patel-Grosz, Yang et moi. It will make a great
stocking stuffer for the holidays so shop early and often.
[8]
Certainly Jackendoff, the daddy of constructionism, has written as much.
[9]
Here is a good place to repeat sotto voce and reverentially: ‘Colorless green
ideas sleep furiously’ and contrast it with ‘Furiously sleep ideas green
colorless.’
[10]
Indeed, if I am right about the Associationist/Empiricist subtext in C&C, then C&C does not actually believe that there are inherent limits on possible constructions. On this reading of C&C, the absence of mirror-image constructions is just a fact about their absence in the relevant linguistic environment. They are fine potential constructions. They just happen not to
occur. One gets a feeling that this is
indeed what C&C thinks by noting how impressed it is with “the
awe-inspiring diversity of the world’s languages” (6). Clearly C&C favors
theories that aim for flexibility to cover this diversity. Linguists, in
contrast, often focus on “negative facts,” possible data that is regularly
absent. These have proven to be reliable indicators of underlying universal
principles/operations. The fact that C&C does not mention this kind of
datum is, IMO, a pretty good indicator that it doesn’t take it seriously. On this view, gaps in the data are accidents, a position that any Associationist/Empiricist would
naturally gravitate towards. In fact, if you want a reliable indicator of A/E tendencies, look for a discussion of negative data. If it does not occur, I would give better than even odds that you are reading the prose of a card-carrying A/Eer.
[11]
Linguists do differ on whether this is a viable
project in general (i.e. likely to be successful). But this is a matter of
taste, not argument. There is no way to know without trying.
[12]
For example, take a look at this
recent piece on the decline of the bee population and the factors behind
it. It ends with a nice discussion of the (often) inscrutable complexity of
interaction effects:
Let's add deer to the list of culprits,
then. And kudzu. It's getting to be a long list. It's also an
indication of what a complex system these bees are part of. Make one change
that you don't think has anything to do with them -- develop a new
pesticide, enact a biofuels subsidy, invent the motorized lawnmower -- and the bees turn out to feel it.
[13]
Actually, it used to be the only game in town. There are some who urge that scientific inquiry give up the aim of understanding. I will be writing a post on this anon.
[14]
This, btw, is not even descriptively correct given the myriad different kinds of
locality that linguists have identified. Indeed, so far as I know, there is no
linear bound between interacting morphemes anywhere in syntax (e.g. agreement,
binding, antecedence, etc.).
Comments:

George Miller and Philip Johnson-Laird began their tome Language and Perception (1976) by complaining: "A repeated lament of those who would understand human nature is that everything is related to everything else", and ended it in an optimistic mood BECAUSE they had broken language and perception down into manageable pieces: "One exciting aspect of the study of language is the feeling, lacking in some areas of psychology, that the problems are tractable, that the work has a cumulative effect, and that even mistakes can ultimately lead to advances in knowledge".

C&C? Classic bullshit (http://facultyoflanguage.blogspot.com/2013/03/science-bullshit.html): everything is related to everything else, heads I win, tails you lose.

It seems unfair to call Jackendoff "the daddy of constructionism" (in your footnote 8). If you want to identify a founding figure, it would be more reasonable to pick Adele Goldberg. If you insist on a male founding figure, Fillmore would be the obvious choice over Jackendoff.

Reply: Actually, grandad is Chomsky in LSLT and SS, where rules are construction-based. The idea that these were really fundamental units goes back to Emmon Bach, I believe. But you are right that Goldberg is also a player, as is Culicover. Jackendoff was useful to me because he clearly believes in UG and so is someone who both likes constructions and buys grammar as real. I have no idea whether Goldberg buys this.