Linguistic theory has a curious asymmetry, at least in
syntax. Let me explain.
Aspects
distinguished two kinds of universals, structural vs substantive. Examples of the former are commonplace: the
Subjacency Principle, Principles of Binding, Cross Over effects, X’ theory with
its heads, complements and specifiers; these are all structural notions that
describe (and delimit) how Gs function. We have discovered a whole bunch of
structural universals (and their attendant “effects”) over the last 60 years, and
they form part of the very rich legacy of the GG research program.
In contrast to all that we have learned about the structural
requirements of G dependencies, we have, IMO, learned a lot less about the
syntactic substances: What is a possible feature? What is a possible category?
In the early days of GG it was taken for granted that syntax, like phonology,
would choose its primitives (atomic elements) from a finite set of options.
Binary feature theories based on the V/N distinction allowed for the familiar
four basic substantive primitive categories A, N, V, and P. Functional
categories were more recalcitrant to systematization, but if asked, I think it
is fair to say that many a GGer could be found assuming that functional
categories form a compact set from which different languages choose different
options. Moreover, if one buys into the Borer-Chomsky thesis (viz. that
variation lives in differences in the (functional) lexicon) and one adds a dash
of GB thinking (where it is assumed that there is only a finite range of
possible variation) one arrives at the conclusion that there are a finite
number of functional categories that Gs choose from and that determine the (finite)
range of possible variation witnessed across Gs. This, if I understand things
(which I probably don’t (recall I got into syntax from philosophy not
linguistics and so never took a phonology or morphology course)), is a pretty
standard assumption within phonology tracing back (at least) to Sound Patterns. And it is also a pretty
conventional assumption within syntax, though the number of substantive
universals we find pale in comparison to the structural universals we have
discovered. Indeed, were I inclined to be provocative (not something I am
inclined to be, as you all know), I would say that we have very few echt substantive universals (theories of
possible/impossible categories/features) when compared to the many many
plausible structural universals we have discovered.
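The binary feature theory mentioned above is easy to make concrete. Here is a minimal sketch in Python (my own illustrative encoding of the classical [±N, ±V] decomposition; no claim that any grammar implements it this way):

```python
# The classical [+/-N, +/-V] feature decomposition of the four
# substantive lexical categories: N = [+N,-V], V = [-N,+V],
# A = [+N,+V], P = [-N,-V].
from itertools import product

CATEGORY = {
    (True, False):  "N",  # [+N, -V]
    (False, True):  "V",  # [-N, +V]
    (True, True):   "A",  # [+N, +V]
    (False, False): "P",  # [-N, -V]
}

def label(n, v):
    """Map an (N, V) feature bundle to its category label."""
    return CATEGORY[(n, v)]

# Two binary features generate exactly four categories: the inventory
# is small and closed, not open-ended.
inventory = sorted(label(n, v) for n, v in product([True, False], repeat=2))
assert inventory == ["A", "N", "P", "V"]
```

The point of the toy is simply that a closed feature basis entails a closed category inventory, which is precisely the kind of substantive restriction that, as noted below, the functional categories currently lack.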
Actually one could go further, so I will. One of the major ambitions
(IMO, achievements) of theoretical syntax has been the elimination of
constructions as fundamental primitives. This, not surprisingly, has devalued
the UG relevance of particular features (e.g. A’ features like topic, WH, or
focus), the idea being that dependencies have the properties they do, not in
virtue of the expressions that head the constructions, but because of the
dependencies that they instantiate. Criterial agreement is useful descriptively
but pretty idle in explanatory terms. Structure rather than substance is
grammatically key. In other words, the general picture that emerged from GB and
more recent minimalist theory is that G dependencies have the properties they
have because of the dependencies they realize rather than the elements that
enter into these dependencies.[1]
Why do I mention this? Because of a recent blog post by
Martin Haspelmath (here,
henceforth MH) that Terje Lohndal sent me. The post argues that to date
linguists have failed to provide a convincing set of atomic “building blocks”
on the basis of which Gs work their magic. MH disputes the following claim:
“categories and features are natural kinds, i.e. aspects of the innate language faculty” and they form “a “toolbox” of categories that
languages may use” (2-3). MH claims that there are few substantive proposals in
syntax (as opposed to phonology) for such a comprehensive inventory of
primitives. Moreover, MH suggests that this is not the main problem with the
idea. What is? Here is MH (3-4):
To my mind, a more serious problem
than the lack of comprehensive proposals is that linguistics has no clear
criteria for assessing whether a feature should be assumed to be a natural kind
(=part of the innate language faculty).
The typical linguistics paper
considers a narrow range of phenomena from a small number of languages (often
just a single language) and provides an elegant account of the phenomena,
making use of some previously proposed general architectures, mechanisms and
categories. It could be hoped that this method will eventually lead to convergent results…but I do not see
much evidence for this over the last 50 years.
And this failure is principled, MH argues, relying as it
does on claims “that cannot be falsified.”
Despite the invocation of that bugbear “falsification,”[2]
I found the whole discussion to be disconcertingly convincing and believe me
when I tell you that I did not expect this.
MH and I do not share a common vision of what linguistics is all about.
I am a big fan of the idea that FL is richly structured and contains at least
some linguistically proprietary information. MH leans towards the idea that
there is no FL and that whatever generalizations there might be across Gs are
of the Greenberg variety.
Need I also add that whereas I love and prize Chomsky
Universals, MH has little time for them and considers the cataloguing and
explanation of Greenberg Universals to be the major problem on the linguist’s
research agenda, universals that are best seen as tendencies and contrasts
explicable “through functional adaptation.” For MH these can be traced to
cognitively general biases of the Greenberg/Zipf variety. In sum, MH denies
that natural languages have joints that a theory is supposed to cut or that
there are “innate “natural kinds”” that give us “language-particular
categories” (8-9).
So you can see my dilemma. Or maybe you don’t, so let me
elaborate.
I think that MH is entirely incorrect in his view of
universals, but the arguments that I would present would rely on examples that
are best bundled under the heading “structural universals.” The arguments that
I generally present for something like a domain specific UG involve structural
conditions on well-formedness like those found in the theories of Subjacency,
the ECP, Binding theory, etc. The arguments I favor (which I think are the
strongest) involve PoS reasoning and insist that bridging the gap, illustrated
by examples in these domains, between the PLD and the competence attained by
speakers of a given G requires domain specific knowledge of a certain
kind.[3]
And all of these forms of argument lose traction when the
issue involves features, categories and their innate status. How so?
First, unlike with the standard structural universals, I
find it hard to identify the gap between impoverished input and expansive
competence that is characteristic of arguments illustrated by standard
structural universals. PLD is not chock full of “corrected” subjacency
violations (aka, island effects) to guide the LAD in distinguishing long kosher
movements from trayf ones. Thus the fact that native speakers respect islands
cannot be traced to the informative nature of the PLD but rather to the
structure of FL. As noted in the previous post (here),
this kind of gap is where PoS reasoning lives and it is what licenses (IMO, the
strongest) claims to innate knowledge. However, so far as I can tell, this gap
does not obviously exist (or is not as easy to demonstrate) when it comes to
supposing that such and such a feature or category is part of the basic atomic
inventory of a G. Features are (often) too specific and variable, combining
under a common label various properties that seem to have little to do with one
another. This is most obvious for phi-features like gender and number, but it
even extends to categories like V and A and N where what belongs where is often
both squishy within a G and especially so across them. This is not to suggest
that within a given G the categories
might not make useful distinctions. However, it is not clear how well these
distinctions travel among Gs. What makes for a V or N in one G might not be
very useful in identifying these categories in another. As I said at the
outset, I am no expert in these matters, but the impression I have come away
with after hearing them discussed is that the criteria for identifying
features within and across languages are not particularly sharp and there is
quite a bit of cross-G variation. If this is so, then the particular properties
that coagulate around a given feature within a given G must be acquired via
experience with that particular feature in that particular G. And if this
is so, then these features differ quite a bit in their epistemological status
from the structural universals that PoS arguments most effectively deploy.
Thus, not only does the learner have to learn which features his G exploits,
but s/he even has to learn which particular properties these features make
reference to, and this makes them poor fodder for the PoS mill.
Second, our theoretical understanding of features and
categories is much poorer than our understanding of structural universals. So
for example, islands are no longer basic “things” in modern theory. They are
the visible byproducts of deeper principles (e.g. Subjacency). From the little
I can tell, this is less so for features/categories. I mentioned the feature
theory underlying the substantive N,V,A,P categories (though I believe that
this theory is not that well regarded anymore). However, this theory, even if
correct, is very marginal nowadays within syntax. The atoms that do the
syntactic heavy lifting are the functional ones, and for this we have no good
theoretical unification (at least so far as I am aware). Currently, we have the
functional features we have, and there is no obvious theoretical restraint on
postulating more whenever the urge arises.
Indeed, so far as I can tell, there is no theoretical (and often,
practical) upper bound on the number of possible primitive features and from
where I sit many are postulated in an ad
hoc fashion to grab a recalcitrant data point. In other words, unlike what
we find with the standard bevy of structural universals, there is no obvious
explanatory cost to expanding the descriptive range of the primitives, and this
is too bad for it bleaches featural accounts of their potential explanatory
oomph.
This, I take it, is largely what MH is criticizing, and if
it is, I think I am in agreement (or more precisely, his survey of things
matches my own). Where we part company is what this means. For me this means
that these issues will tell us relatively little about FL and so fall outside
the main object of linguistic study. For MH, this means that linguistics will
shed little light on FL as there is nothing FLish about what linguistics
studies. Given what I said above, we can, of course, both be right, since
we are largely agreeing: if MH’s description of the study of substantive
universals is correct, then the best we might be able to do is Greenberg, and
Greenberg will tell us relatively little about the structure of FL. If that is
the argument, I can tag along quite a long way towards MH’s conclusion. Of
course, this leaves me secure in my conclusion that what we know about
structural universals argues the opposite (viz. a need for linguistically
specific innate structures able to bridge the easily detectable PoS gaps).
That said, let me add three caveats.
First, there is at least one apparent substantive universal
that I think creates serious PoS problems: the Universal Base Hypothesis (UBH).
Cinque’s work falls under this rubric as well, but the one I am thinking about
is the following. All Gs are organized into three onion-like layers, what
Kleanthes Grohmann has elegantly dubbed “prolific domains” (see his thesis).
Thus we find a thematic layer embedded into an agreement/case layer embedded
into an A’/left periphery layer. I know
of no decent argument against this kind of G organization. And if this
is true, it raises the question of why
it is true. I do not see that the class of dependencies that we find would
significantly change if the onion were inversely layered (see here
for some discussion). So why is it layered as it is? Note that this is more
abstract than your typical Greenberg universal, as it is not a fact about the
surface form of the string but the underlying hierarchical structure of the
“base” phrase marker. In modern parlance, it is a fact about the selection
features of the relevant functional heads (i.e. about the features (aka
substance) of the primitive atoms). It does not correspond to any fact about
surface order, yet it seems to be true. If it is, and I have described it
correctly, then we have an interesting PoS puzzle on our hands, one that deals
with the organization of Gs which likely traces back to the structure of FL/UG.
I mention this because unlike many of the Greenberg universals, there is no
obvious way of establishing this fact about Gs from their surface properties
and hence explaining why this onion like structure exists is likely to tell us
a lot about FL.
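The layering claim can be given a toy formalization. In the sketch below (my own encoding; the head labels and the selection rule are illustrative assumptions, not anyone’s worked-out proposal), each head is assigned to one of the three prolific domains, and a clausal spine is well layered only if domain indices never increase going downward, which rules out the inverted onion:

```python
# Toy check of the UBH's three "prolific domains":
# A'/left-periphery (2) > agreement/case (1) > thematic (0).
DOMAIN = {
    "C": 2, "Top": 2, "Foc": 2,   # A'/left-periphery layer
    "T": 1, "Agr": 1,             # agreement/case layer
    "v": 0, "V": 0,               # thematic layer
}

def well_layered(spine):
    """A head may select only heads in its own domain or a lower one,
    so domain indices must be non-increasing from top to bottom."""
    ds = [DOMAIN[h] for h in spine]
    return all(a >= b for a, b in zip(ds, ds[1:]))

assert well_layered(["C", "T", "v", "V"])   # the attested organization
assert not well_layered(["v", "T", "C"])    # the inverted onion
```

Note that the PoS puzzle survives the formalization: nothing in the selection rule itself explains why the domains are ranked this way rather than inversely.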
Second, it is quite possible that
many Greenberg universals rest on innate foundations. This is the message I
take away from the work by Culbertson & Adger (see here
for some discussion). They show that certain orders within nominals relating
Demonstratives, Adjectives, Numerals and head Nouns are very hard to acquire
within an artificial G setting. They use this to argue that their absence as
Greenberg options has a basis in how such structures are learned. It is not entirely clear that this learning
bias is FL internal (it regards relating linear and hierarchical order) but it
might be. At any rate, I don’t want anything I said above to preclude the possibility
that some surface universals might reflect features of FL (i.e. be based on
Chomsky Universals), and if they do it suggests that explaining (some)
Greenberg universals might shed some light on the structure of FL.
Third, though we don’t have many good theories of features
or functional heads, a lazy perusal of the facts suggests that not just anything
can be a G feature or a G head. We find phi features all over the place. Among
the phi features we find that person, number and gender are ubiquitous. But if anything goes, why don’t we find more obviously communicatively and
biologically useful features (e.g. the +/- edible feature, or the +/- predator
feature, or the +/- ready for sex feature or…). We could imagine all sorts of
biologically or communicatively useful features that it would be nice for
language to express structurally that we just do not find. And the ones that we
do find seem, from a communicative or biological point of view, often to be idle
(gender (and, IMO, case) being the poster child for this). This suggests that
whatever underlies the selection of features we tend to see (again and again)
and those that we never see is more principled than “anything goes.” And if that
is correct, then what basis could there be for this other than some
linguistically innate proclivity to press these features as opposed to those
into linguistic service. Confession: I
do not take this argument to be very strong, but it seems obvious that the
range of features we find in Gs that do grammatical service is pretty small,
and it is fair to ask why this is so and why many other conceivable features
that we could imagine would be useful
are nonetheless absent.
Let me reiterate a point about my shortcomings I made at the
outset. I really don’t know much about features/categories and their uniform
and variable properties. It is entirely possible that I have underestimated
what GG currently knows about these matters. If so, I trust the comments
section will set things straight. Until that happens, however, from where I sit
I think that MH has a point concerning how features and categories operate
theoretically and that this is worrisome. That we draw opposite conclusions
from these observations is of less moment than that we evaluate the current
state of play in roughly the same way.
[1] This is the main theme of On Wh-Movement
and, I believe, what drives the unification behind Merge-based accounts of FL.
[2] Falsification is not a particularly good criterion of scientific adequacy, as
I’ve argued many times before. It is usually used to cudgel positions one
dislikes rather than push understanding forward. That said, in MH’s post the
invocation of the F word plays little more than an ornamental role. There are
serious criticisms that come into play.
[3] I abstract here from minimalist considerations that try to delimit the
domain specificity of the requisite assumptions. As you all know, I tend to
think that we can reduce much of GB to minimalist principles. To the degree
that this hope is not in vain, the domain specificity can be
circumscribed to whatever it is that minimalism needs to unify the apparently
very different principles of GB and the generalizations that follow from them.