I confess that I did not read the Evans and Levinson article
(EL) (
here) when it first came out. Indeed, I didn’t read it until last
week. As you might guess, I was not
particularly impressed, though not necessarily for the reason you might
think. What struck me most was the crudity of the arguments aimed at the
Generative Program, something that the (reasonable) commentators (e.g. Baker,
Freidin, Pinker and Jackendoff, Harbour, Nevins, Pesetsky, Rizzi, Smolensky and
Dupoux a.o.) zeroed in on pretty quickly. The crudity is a reflection, I
believe, of a deep-seated empiricism, one that is wedded to a rather
superficial understanding of what constitutes a possible “universal.” Let me
elaborate.
EL adumbrates several conceptions of universal, all of which
the paper intends to discredit. EL distinguishes substantive universals from
structural universals and subdivides the latter into Chomsky vs Greenberg formal
universals. The paper’s mode of argument is to provide evidence against a
variety of claims to universality by citing data from a wide range of
languages, data that EL appears to believe demonstrate the obvious inadequacy
of contemporary proposals. I have no expertise in typology, nor am I
philologically adept. However, I am pretty sure that most of what EL discusses
cannot, as it stands, bear on many of
the central claims made by Generative Grammarians of the Chomskyan stripe. To
make this case, I will have to back up a bit and then talk on far too long.
Sorry, but another long post. Forewarned, let’s begin by asking a question.
What are Generative Universals (GUs) about? They are intended to be, in the first
instance, descriptions of the properties of the Faculty of Language (FL). FL
names whatever it is that humans have as biological endowment that allows for
the obvious human facility for language. It is reasonable to assume that FL is
both species and domain specific. The species specificity arises from the
trivial observation that nothing does language like humans do (you know: fish
swim, birds fly, humans speak!). The domain specificity is a natural conclusion
from the fact that this facility arises in all humans pretty much in the same
way independent of other cognitive attributes (i.e. both the musical and the
tone deaf, both the hearing impaired and sharp eared, both the mathematically
talented and the innumerate develop language in essentially the same way). A natural conclusion from this is that humans
have some special features that other animals don’t as regards language and
that human brains have language specific “circuits” on which this talent rests.
Note, this is a weak claim: there is something
different about human minds/brains on which linguistic capacity supervenes.
This can be true even if lots and lots of our linguistic facility exploits the
very same capacities that underlie other forms of cognition.
So there is something special about human minds/brains as
regards language and Universals are intended to be descriptions of the powers
that underlie this facility; both the powers of FL that are part of general
cognition and those unique to linguistic competence. Generativists have proposed elaborating the
fine structure of this truism by investigating the features of various natural
languages and, by considering their properties, adumbrating the structure of
the proposed powers. How has this been done? Here again are several trivial
observations with interesting consequences.
First, individual languages have systematic properties. It
is never the case that, within a given language, anything goes. In other words, languages are rule governed.
We call the rules that govern the patterns within a language a grammar. For
generativists, these grammars, their properties, are the windows into the structure
of FL/UG. The hunch is that by studying the properties of individual grammars,
we can learn about that faculty that manufactures grammars. Thus, for a generativist, the grammar is the relevant unit of
linguistic analysis. This is important. For grammars are NOT surface
patterns. The observables linguists have tended to truck in relate to patterns
in the data. But these are only way stations to the data of interest: the grammars that generate these patterns. To talk about FL/UG one needs to study
Gs. But Gs are themselves inferred from
the linguistic patterns that Gs generate, which are themselves inferred from
the natural or elicited bits of linguistic production that linguists bug
their friends and collaborators to cough up. So, to investigate FL/UG you need
Gs and Gs should not be confused with their products/outputs, only some of
which are actually perceived (or perceivable).
Second, as any child can learn any natural language, we are
entitled to infer from the intricacies of any given language powers of
FL/UG capable of dealing with such intricacies.
In other words, the fact that a given language does NOT express property
P does not entail that FL/UG is not sensitive to P. Why? Because a description
of FL/UG is not an account of any given language/G but an account of linguistic
capacity in general. This is why one can
learn about the FL/UG of an English speaker by investigating the grammar of a
Japanese speaker and the FL/UG of both by investigating the grammar of a
Hungarian, or Swahili, or Slave speaker. Variation among different grammars is
perfectly compatible with invariance in FL/UG, as was recognized from the
earliest days of Generative Grammar. Indeed, this was the initial puzzle: find
the invariance behind the superficial difference!
Third, given that some
languages display the signature properties of recursive rule systems (systems
that can take their outputs as inputs), it must be the case that FL/UG is
capable of concocting grammars that have this property. Thus, whatever G an
individual actually has, that individual’s FL/UG is capable of producing a
recursive G. Why? Because that individual could
have acquired a recursive G even if that individual’s actual G does not
display the signature properties of recursion. What are these signature
properties? The usual: unboundedly large
and deep grammatical structures (i.e. sentences of unbounded size). If a given
language appears to have no upper bound on the size of its sentences, then it's
a sure bet that the G that generates the structures of that language is
recursive in the sense of allowing structures of type A as parts of structures
of type A. This, in general, will suffice to generate unboundedly big and deep
structures. Examples of this type of recursion include conjunction,
conditionals, embedding of clauses as complements of propositional attitude
verbs, relative clauses etc. The reason
that linguists have studied these kinds of configurations is precisely because
they are products of grammars with this interesting property, a property that
seems unique to the products of FL/UG, and hence capable of potentially telling
us a lot about the characteristics of FL/UG.
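A toy sketch may make the signature property concrete. This is purely my illustration, with made-up bracketings, not a rule system proposed by EL or by any generative analysis: the point is only that a rule which lets a structure of type S contain another S yields structures of unbounded depth.

```python
# Toy recursive grammar sketch (illustrative only, not a serious linguistic model).
# A clause (S) may embed another clause as the complement of a verb like
# "thinks", so structures of type S occur as parts of structures of type S.

def clause(depth):
    """Build a bracketed clause with `depth` levels of embedding."""
    if depth == 0:
        return "[S Mary left]"
    # The recursive step: an S contains an S.
    return f"[S John thinks {clause(depth - 1)}]"

print(clause(0))  # [S Mary left]
print(clause(2))  # [S John thinks [S John thinks [S Mary left]]]
```

Since `depth` can be any natural number, the rule generates sentences of unbounded size — exactly the signature property described above.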
Before proceeding, it is worth noting that the absence of
these noted signature properties in a given language L does not imply that a
grammar of L is not basically recursive.
Sadly, EL seems to leap to this conclusion (443). Imagine that for some
reason a given G puts a bound of 2 levels of embedding on any structure in L.
Say it does this by placing a filter (perhaps a morphological one) on more
complex constructions. Question: what is the correct description of the grammar
of L? Well, one answer is that it does
not involve recursive rules for, after all, it does not allow unbounded
embedding (by supposition). However,
another perfectly possible answer is that it allows exactly the same kinds of
embedding that English does modulo this
language specific filter. In that
case the grammar will look largely like the ones that we find in languages like
English that allow unbounded embedding, but with the additional filter. There
is no reason just from observing that
unbounded embedding is forbidden to conclude that this
hypothetical language L (aka Kayardild or Piraha) has a grammar different in kind from the grammars we attribute
to English, French, Hungarian, Japanese etc. speakers. In fact, there is reason to think that the Gs
that speakers of this hypothetical language have do in fact look just like those of
English etc. The reason is that FL/UG is
built to construct these kinds of grammars and so would find it natural to do
so here as well. Of course L would seem
to have an added (arbitrary) filter on the embedding structures, but otherwise
the G would look the same as the G of more familiar languages.
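Continuing the toy sketch (again my own illustration, not anything in EL): the recursive rule is left untouched, and all that differs is a hypothetical language-specific filter that rejects outputs beyond two levels of embedding.

```python
# Toy sketch: the same recursive clause rule as before, plus a hypothetical
# language-specific filter that rejects outputs with more than 2 levels of
# embedding. The rule system itself is unchanged -- only the filter differs.

def clause(depth):
    """Build a bracketed clause with `depth` levels of embedding."""
    if depth == 0:
        return "[S Mary left]"
    return f"[S John thinks {clause(depth - 1)}]"

MAX_EMBEDDING = 2  # the (arbitrary) filter

def passes_filter(sentence):
    """Count embedded clauses; reject if over the bound."""
    embeddings = sentence.count("[S") - 1
    return embeddings <= MAX_EMBEDDING

print(passes_filter(clause(2)))  # True
print(passes_filter(clause(5)))  # False: filtered out, though generable by the same rules
```

Observing only the filtered outputs, one could not tell that the underlying rules are recursive — which is the point: surface boundedness does not license conclusions about the generative machinery.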
An analogy might help.
I’ve rented cars that have governors on the accelerators that cap speed
at 65 mph. The same car without the
governor can go far above 90 mph. Question: do the two cars have the same
engine? You might answer “no” because of
the significant difference in upper limit speeds. Of course, in this case, we
know that the answer is “yes”: the two cars work in virtually identical ways,
have the very same structures but for the
governor that prevents the full velocity potential of the rented car from
being expressed. So, the conclusion that
the two cars have fundamentally different engines would be clearly
incorrect. Ok: swap Gs for engines and
my point is made. Let me repeat it: the
point is not that the Gs/engines
might be different in kind, the point is that simple observation of the
differences does not license the conclusion that they are (viz. you are not
licensed to conclude that they are just finite state devices because they don’t
display the signature features of unbounded recursion, as EL seems to). And, given what we know about Gs and engines
the burden of proof is on those that conclude from such surface differences to
deep structural differences. The
argument to the contrary can be made, but simple observations about surface
properties just don’t cut it.
Fourth, there are at least two ways to sneak up on
properties of FL/UG: (i) collect a bunch of Gs and see what they have in common (what
features do all the Gs display) and (ii) study one or two Gs in great detail
and see if their properties could be acquired from input data. If any could not
be, then these are excellent candidate basic features of FL/UG. The latter, of
course, is the province of the POS argument.
Now, note that as a matter of logic the fact that some G fails to have
some property P can in principle falsify a claim like (i) but not one like
(ii). Why? Because (i) is the claim that
every G has P, while (ii) is the claim that if G has P then P is the
consequence of G being the product of FL/UG. Absence of P is a problem for
claims like (i) but, as a matter of logic, not for claims like (ii) (recall, If
P then Q is true if P is false). Unfortunately,
EL seems drawn to the conclusion that P→Q is falsified if ¬P is
true. This is an inference that other papers (e.g. Everett’s Piraha work) are
also attracted to. However, it is a non sequitur.
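The point about material implication can be checked mechanically; this is a trivial rendering of the truth table, nothing more:

```python
# "If P then Q" (material implication) is true whenever P is false,
# so finding a grammar that lacks P cannot falsify a type-(ii) claim.

def implies(p, q):
    """Material conditional: P -> Q."""
    return (not p) or q

# P false: the conditional holds no matter what Q is.
assert implies(False, True)
assert implies(False, False)
# Only P true with Q false falsifies it.
assert not implies(True, False)
```

A language lacking property P simply makes the antecedent false, and the type-(ii) claim stands untouched.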
EL recognizes that arguing from the absence of some property
P to the absence of Pish features in UG does not hold. But the paper clearly wants to reach this
conclusion nonetheless. Rather than denying the logic, EL asserts that “the
argument from capacity is weak” (EL’s
emphasis). Why? Because EL really wants all universals to be of the (i)
variety, at least if they are “core” features of FL/UG. As these type (i)
universals must show up in every G if they are indeed universal, failure to
appear in even one grammar is sufficient to call their universality into question. EL
is clearly miffed that Generativists in general and Chomsky in particular would
hold a nuanced position like (ii). EL seems to think that this is cheating in some
way. Why might they hold this? Here’s
what I think.
As I discussed extensively in another place
(here), everyone
who studies human linguistic facility appreciates that competent speakers of a
language know more than they have been exposed to. Speakers are exposed to bits of language and
from this acquire rules that generalize to novel exemplars of that
language. No sane observer can dispute
this. What’s up for grabs is the nature
of the process of generalization. What separates empiricist from rationalist
conceptions of FL/UG is the nature of these inductive processes. Empiricists
analyze the relevant induction as a species of pattern recognition. There are
patterns in the data and these are generalized to all novel cases.
Rationalists appreciate that this is an option, but insist that there
are other kinds of generalizations, those based on the architectural properties
(Smolensky and Dupoux’s term) of the generative procedures that FL/UG allow.
These procedures need not “resemble” the outputs they generate in any obvious
way and so conceiving this as a species of pattern recognition is not useful
(again, see
here for more discussion).
Type (ii) universals fit snugly into this second type, and so
empiricists won’t like them. My own
hunch is that an empiricist affinity for generalizations based on patterns in
the data lies behind EL’s dissatisfaction with “capacity” arguments; they are
not the sorts of properties that inspection of cases will make manifest. In
other words, the dissatisfaction is generated by Empiricist sympathies and/or
convictions which, from where I sit, have no defensible basis. As such, they
can be and should be discounted. And in a rational world they would be. Alas…
Before ending, let me note that I have been far too generous
to the EL paper in one respect. I said
at the outset that its arguments are crude. How so? Well, I have framed the paper’s main point as
a question about the nature of Gs. However, most of the discussion is framed
not in terms of the properties of Gs they survey but in terms of surface forms
that Gs might generate. Their discussion
of constituency provides a nice example (441).
They note that some languages display free word order and conclude from
this that they lack constituents.
However, surface word order facts cannot possibly provide evidence for
this kind of conclusion; they can only tell us about surface forms. It is
consistent with this that elements that are no longer constituents on the
surface were constituents earlier on and were then separated, or will become
constituents later on, say on the mapping to logical form. Indeed, in one sense of the term constituent,
EL insists that discontinuous expressions are
such, for they form units of interpretation and agreement. The mere fact that
elements are discontinuous on the surface tells us nothing about whether they
form constituents at other levels. I would not mention this were it not that this has been the
classical position within Generative Grammar for the last 60 years. Surface
syntax is not the arbiter of constituency, at least if one has a theory of
levels, as virtually every theory that sees grammars as rules that relate
meaning with sounds assumes (EL assumes this too). There is nary a grammatical structure in EL
and this is what I meant by my being overgenerous. The discussion above is
couched in terms of Gs and their features. In contrast, most of the examples in
EL are not about Gs at all, but about word strings. However, as noted at the
outset, the data relevant to FL/UG are Gs and the absence of Gish examples in
EL makes most of EL’s cited data irrelevant to Generative conceptions of FL/UG.
Again, I suspect that the swapping of string data for G data
simply betrays a deep empiricism, one that sees grammars as regularities over
strings (string patterns) and FL/UG as higher order regularities over Gs.
Patterns within patterns within patterns. Generativists have long given up on
this myopic view of what can be in FL/UG.
EL does not take the Generative Program on its own terms and show that
it fails. It outlines a program that Generativists don’t adopt and then shows
that it fails by standards Generativists have always rejected, using data that are nugatory.
I end here: there are many other criticisms worth making
about the details, and many of the commentators on the EL piece, better placed
than I to make them, do so. However, to my mind, the real difficulty with EL is
not at the level of detail. EL’s main point as regards FL/UG is not wrong; it
is simply beside the point. A lot of
sound and fury signifying nothing.