In the last post I was quite critical of a piece that I thought mischaracterized the nature of linguistic inquiry of the Chomsky GG variety. I thought that I should do more than hector from the sidelines (though when Hector left the sidelines things did not end well for him). Here is an attempt to outline what is not (or should not be) controversial. It tries to outline the logic of GG investigations, the questions that orient it, and the rational history that follows from pursuing these questions systematically. This is not yet a piece for the uninitiated, but fleshed out, I think it could serve as a reasonably good intro to what one stripe of linguist does and why. There is need for more filling (illustrations of how linguists get beyond and build on the obvious). But this is a place to start, IMO, and if one starts here lots of misconceptions will be avoided.
Linguistics (please note the ‘i’ here) revolves around three questions:
(1) What’s a possible linguistic structure in L?
(2) What’s a possible G (for a given PLD)?
(3) What’s a possible FL (for humans)?
These three questions correspond to three facts:
(1’) The fact of linguistic creativity (a native speaker can and does regularly produce and understand linguistic objects never before encountered by her/him)
(2’) The fact of linguistic promiscuity (any kid can acquire any language in (roughly) the same way as any other kid/language)
(3’) The fact of linguistic idiosyncrasy (humans alone have the linguistic capacities they evidently have (i.e. both (1’) and (2’) are species specific facts))
Three big facts, three big questions concerning those facts.
And three conclusions:
(1’’) Part of what makes native speakers proficient in a language is their cognitive internalization of a recursive G
(2’’) Part of human biology specifies a species wide capacity (UG) to acquire recursive Gs (on the basis of PLD)
(3’’) Humans alone have evolved the Gish capacities and meta-capacities specified in (1’’) and (2’’) in the sense that our ancestors did not have this meta-capacity (nor do other animals) and we do
IMO, the correctness of these conclusions is morally certain
(certain in the sense that, though not logically
required, they are trivially obvious and
indubitable once the facts in (1’-3’) are acknowledged). Or, to put this
another way, the only way to deny the trivial truths in (1’’-3’’) is to deny
the trivial facts in (1’-3’). Note that this does not mean that these are the only
questions one can ask about language, but if
the questions in (1-3) are of interest to you (and nobody can force anybody to
be interested in any question!), then the consequences that follow from them
are sound foundations for further inquiry. When Chomsky claims that many of the
controversial positions he has advanced are not really controversial, this is what he means. He means that whatever
intellectual contentiousness exists regarding the claims above in no way detracts
from their truistic nature. Trivial and true!
Hence, intellectually
uncontroversial. He is completely right about this.
So, humans have a species specific dedicated capacity to
acquire recursive Gs. Is this all
that we can trivially deduce from obvious facts? Nope. We can also observe that
these recursive Gs have a side that we can informally call a meaning (M), and a
side that we can informally call a sound (S) (or, more precisely, an
articulation). So, the recursive G pairs meanings with sounds (in Chomsky’s
current formulation of the old Aristotelian observation; and yes, it is very
old because very trivial). And this unbounded pairing of Ms and Ss is
biologically novel in humans. Does this mean that anything we can call language rests on properties unique to
humans? Nope. All that follows (but it does follow trivially) is that this
unbounded capacity to pair Ms and Ss is biologically species specific. So, even
if being able to entertain thoughts is not
biologically specific and the capacity to produce sounds (indeed many many) is not biologically unique, the capacity to
pair Ms with Ss open-endedly IS. And part of the project of linguistics
is to explain (i) the fine structure of
the Gs we have that subvene this open-ended pairing, (ii) the UG (i.e.
meta-capacity) we have that allows for the emergence of such Gs in humans and (iii)
a specification of how whatever is distinctively linguistic about this
meta-capacity fits in with all the other non
linguistically proprietary and exclusively human cognitive and computational
capacities we have to form the complex capacity we group under the encyclopedia
entry ‘language.’
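To make the "open-ended pairing" point concrete, here is a minimal sketch of a toy recursive procedure that returns a sound paired with a meaning. Everything in it is an illustrative assumption rather than an analysis of any real G: the two-item lexicon, the single embedding rule, and the function name `clause` are all made up for the example.

```python
# Toy recursive G: each call returns a (sound, meaning) pair.
# The lexicon and the one embedding rule are hypothetical, chosen
# only to show that a finite procedure pairs Ss with Ms without bound.

def clause(depth):
    """Return a (sound, meaning) pair with `depth` levels of embedding."""
    if depth == 0:
        return ("it rains", "RAIN")
    inner_sound, inner_meaning = clause(depth - 1)
    sound = "Mary thinks that " + inner_sound
    meaning = "THINK(MARY, " + inner_meaning + ")"
    return (sound, meaning)

for d in range(3):
    print(clause(d))
```

Any non-negative `depth` yields a new pair, so the set of pairings has no longest member even though the procedure itself is tiny.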
The first two parts of the linguistic project have been well
explored over the last 60 years. We know something about the kinds of recursive
procedures that particular Gs deploy and something about the possible kinds of operations/rules that
natural language Gs allow. In other words, we know quite a bit about Gs and UG.
Because of this in the last 25 years or so it has become fruitful to ruminate
about the third question: how it all came to pass, or, equivalently, why we
have the FL we have and not some other? It is a historic achievement of the
discipline of linguistics that this question is ripe for investigation. It is
only possible because of the success in discovering some fundamental properties
of Gs and UG. In other words, the Minimalist Program is a cause for joyous
celebration (cue the fireworks here). And not only is the problem ripe, there
is a game plan. Chomsky has provided a plausible route towards addressing this
very hard problem.
Before outlining the logic (yet again) let’s stop and appreciate
what makes the question hard. It’s hard because it requires distinguishing two
different kinds of universals, those that are cognitively and computationally
general from those that are linguistically proprietary, and doing this in a
principled way. And that is hard. Very very hard. For it requires thinking of
what we formerly called UG as an interaction effect, and hence as not a unitary kind of thing. Let me
explain.
The big idea behind minimalism is that much of the
“mechanics” behind our linguistic facility is not linguistically parochial. Only
a small part is. In practical terms, this means that much of what we identified
as “linguistic universals” from about the mid 1960s to the mid 1990s are
themselves composed of operations only some of which are linguistically
proprietary. In other words, just as GB proposed to treat constructions as the
interaction of various kinds of more general mechanisms rather than as unitary
linguistic “rules,” now minimalism is asking that we think of universals as
themselves composed of various kinds of interacting, more primitive computational
and cognitive operations, only some of
which are linguistically proprietary.
In fact, the minimalist conceit is that FL is mostly comprised of computational
operations that are not specific to language. Note the ‘mostly.’ However, this
means that at least some part of FL is linguistically specific/special
(remember 3/3’/3’’ above). The research problem is to separate the domain
specific wheat from the domain general chaff. And that requires treating most
of the “universals” heretofore discovered as complexes and showing how their properties could arise from
the interaction of the general and specific operations that make them up. And
that is hard both analytically and empirically.
Analytically it is hard because it requires identifying
plausible candidates for the domain general and the linguistically proprietary
operations. It is empirically difficult for it requires expanding how we
evaluate our empirical results. An analogy with constructions and their
“elimination” as grammatical primitives might make this clearer.
The appeal of constructions is that they correspond fairly
directly to observable surface features of a language. Topicalizations have
topics which sit on the left periphery. Topics have certain semantic
properties. Topicalizations allow unbounded dependencies between the topic and
a thematic position, though not if the gap is inside an island and the gap is
null. Topicalization is similar to, but
different from Wh-questions, which are in some ways similar to focus
constructions, and in some ways not and all are in some ways similar to
relative clause constructions and in some ways not. These constructions have
all been described numerous times identifying more and more empirical nuances. Given
the tight connection between constructions and their surface indicators, they
are very handy ways of descriptively carving up the data because they provide
useful visible landmarks of interest. They earn their keep empirically and
philologically. Why then dump them? Why eliminate them?
Mid 1980s theory did so because they inadequately answer a
fundamental question: why do constructions that are so different in so many
ways nonetheless behave the same way as regards, say, movement? Ross established
that different constructions behaved similarly wrt island effects, so the
question arose as to why this was so. One plausible answer is that despite
their surface differences, various constructions are composed from similar
building blocks. More concretely, all the identified constructions involve a
‘Move Alpha’ (MA) component and MA is subject to locality conditions of the
kind that result in island effects if violated. So, why do they act the same?
Because they all use a common component which is grammatically subject to the
relevant locality condition.
Question asked. Question answered. But not without failing
to cover all the empirical ground constructions did. Thus, what about all the
differences? After all, nobody thinks that Topicalization and Relativization
are the same thing! Nobody. All that
is claimed is that they are formed exploiting a common sub-operation and that
is why they all conform to island
restrictions. How are the differences handled? Inelegantly. They are “reduced”
to “criterial conditions” that a head imposes on its spec or feature requirements
that a probe imposes on its goal. In other words, constructions are factored
into the UG relevant part (subject to a specific kind of locality) and the G
idiosyncratic part (feature/criteria requirements between heads and phrases of
a certain sort). In other words, constructions are “eliminated” in the sense of
being grammatically basic, not in being objects of the language with the
complex properties they have.
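The factorization just described can be sketched in a few lines. This is a hedged illustration, not anyone's actual theory: the feature labels (`Top`, `Q`, `Rel`), the island labels, and the representation of a movement path as a list of node labels are all assumptions made for the example. The point is only that the constructions differ in one place (the criterial feature) and behave alike in another (the shared movement check).

```python
from dataclasses import dataclass

# Hypothetical island labels; a path crossing one of these is blocked.
ISLANDS = {"ComplexNP", "Adjunct"}

@dataclass(frozen=True)
class Construction:
    name: str
    criterial_feature: str  # the G-idiosyncratic part

def move_alpha(path):
    """Shared movement component: fails if the path crosses an island."""
    return not any(node in ISLANDS for node in path)

def licensed(construction, head_feature, path):
    """A construction is licensed if its own feature requirement is met
    AND the common movement component is satisfied."""
    feature_ok = head_feature == construction.criterial_feature
    return feature_ok and move_alpha(path)

CONSTRUCTIONS = [
    Construction("topicalization", "Top"),
    Construction("wh-question", "Q"),
    Construction("relativization", "Rel"),
]

for c in CONSTRUCTIONS:
    ok = licensed(c, c.criterial_feature, ["CP", "CP"])
    blocked = licensed(c, c.criterial_feature, ["CP", "ComplexNP"])
    print(c.name, ok, blocked)
```

Because every construction routes through the same `move_alpha`, they all show the same island behavior, while the differences live entirely in `criterial_feature`.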
Constructions, in other words, are the result of the complex
interactions of more primitive Gish operations/features/principles. They are
interaction effects, with all the complexity this entails.
But this factorization is not enough. One more thing is
required to make deconstructing constructions into their more basic constituent
parts all theoretically and empirically worthwhile. It is required that we
identify some signature properties of the more abstract MA that is a
fundamental part of the other constructions, and that’s what all the fuss about
successive cyclicity was about. It was interesting because it provided a
signature property of the movement operation: what appears to be unbounded movement is actually composed
of small steps, and we were able to track those steps. And that was/is a big
deal. It vindicated the idea that we should analyze complex constructions as
the interaction of more basic operations.
Let’s now return to the problem of distilling the domain
general from the domain specific wrt FL. This will be hard for we must identify
plausible operations of each type, show that in combination they yield
comparable empirical coverage as earlier UG principles, and identify some
signature properties of the domain specific operations/principles. All of this
is hard to do, and hence the intellectual interest of the problem.
So what is Chomsky’s proposed route to this end? His
proposal is to take recursive hierarchy as the single linguistically specific
property of FL. All other features of FL are composite. The operation that
embodies this property is, of course, Merge. The conceit is that the simplest
(or at least one very simple)
operation that embodies this property also has other signature properties we
find universally in Gs (e.g. embodies both structure building and displacement,
provides G format for interpretation and reconstruction effects, etc.[1]).
So identify the right distinctive
operation and you get as reward an account for why Gs display some signature
properties.
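For readers who like to see the idea run, here is a minimal sketch of Merge as binary set formation. It assumes, for illustration only, that syntactic objects are either strings (lexical items) or two-membered unordered sets, and that displacement is re-merging something already inside the structure. The helper names `occurrences` and `internal_merge` are mine, not standard notation.

```python
def merge(a, b):
    """Merge(a, b) = {a, b}: an unordered set, so no linear order is encoded."""
    return frozenset([a, b])

def occurrences(tree, item):
    """Count how many times `item` occurs as a term of `tree`."""
    if tree == item:
        return 1
    if isinstance(tree, frozenset):
        return sum(occurrences(part, item) for part in tree)
    return 0

def internal_merge(tree, part):
    """Re-merge a term already contained in `tree` (displacement)."""
    if tree == part or occurrences(tree, part) == 0:
        raise ValueError("internal merge needs a proper term of the tree")
    return merge(part, tree)

# Structure building (external merge), then displacement (internal merge).
vp = merge("eat", "what")
tp = merge("you", vp)
cp = internal_merge(tp, "what")

print(occurrences(tp, "what"))  # one occurrence before displacement
print(occurrences(cp, "what"))  # two occurrences after displacement
```

The same operation builds hierarchy and, applied to a term it already contains, yields displacement with a lower occurrence left in place, which is the sort of format that interpretation and reconstruction are usually taken to exploit.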
Does this mean that FL only contains Merge? No. If true, it
means that Merge is the only linguistically distinctive operation of this
cognitive component. FL has other principles and operations as well. So feature
checking is a part of FL (Gs do this all the time and it is the locus of G differences),
though it is unlikely that feature checking is an operation proprietary to FL (even though Gs do it
and FL exploits it). Minimality is likely an FL property, but one hopes that it
is just a special instance of a more general property that we find in other
domains of cognition (e.g. similarity based interference).[2]
So too with phases (one hopes), which function to bound the domain of
computations, something that well designed systems will naturally do. Again,
much of the above are promissory notes, not proposals, but hopefully you get
the idea. Merge works in concert with these more generic cognitive and
computational operations to deliver an FL.
IMO (not widely shared I suspect), the program is doing
quite well in providing a plausible story along these lines. Why do we have the
FL we have? Because it is the simplest (or very simple) combination of generic
computational and cognitive principles plus one very simple linguistically
distinctive operation that yields a most distinctive feature of human
linguistic objects, unbounded hierarchy.
Why is simple important? Because it is a crucial ingredient
of the phenotypic gambit (see here).
We are assuming that simple and evolvable are related. Or, more exactly, we are
taking phenotypically simple as proxy for genetically simple as is typical in a
lot of work on evolution.[3]
So linguistics starts from three questions rooted in three
basic facts and resulting in three kinds of research; into G, into UG and into
FL. These questions build on one another (which is what good research questions
in healthy sciences do). The questions get progressively harder and more
abstract. And, answers to later questions prompt revisions of earlier
conclusions. I would like to end this overlong disquisition with some
scattered remarks about this.
As noted, these projects take in one another’s wash. In
particular, the results of earlier lines of inquiry are fodder for later ones.
But they also change the issues. MP refines the notion of a universal,
distancing it even more than its GB ancestor does from Greenbergian
considerations. GB universals are quite removed from the simple observations
that motivate a Greenberg style universal (recall: they are largely based on negative data). However, MP universals
are at some distance even from classical GB universals in that MP worries the distinction
between those cognitive features that are linguistically proprietary and those that
are not in a way that GB seldom (never?) did. Consequently, MP universals (e.g.
Merge) are even more “abstract” than their GBish predecessors, which, of
course, makes them more remote from the kind of language particular data that linguists
are trained to torture for insights.
Or to put this another way: MP is necessarily less
philologically focused than even GB was. The focus of inquiry is explicitly the
fine structure of FL. This was also true of earlier GBish theories, but, as
I’ve noted before, this focus could be obscured. The philologically inclined
could have their own very good reasons for “going GB,” even absent mentalist
pretensions. MP’s focus on the structure of FL makes it harder (IMO,
impossible) to evade a mentalist focus.[4]
A particularly clear expression of the above is the MP view
of parameters. In GBish accounts parameters are internal properties of FL that
delimit the class of possible Gs. Indeed, Chomsky made a big deal of the fact
that in P&P theories there were a finite number of Gs (though perhaps a
large finite number) dependent on the finite number of choices for values FL
allowed. This view of parameters fit well with the philologist’s interest in variation,
for it proposed that variation was severely confined, limited to a finite
number of possible differences. On this
view, the study of variation feeds into a study of FL/UG by way of a study of
the structure of the finite parameter space. So, investigating different
languages and how they vary is, on this view, the obvious way of studying the
parametric properties of FL.
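The finiteness claim is just arithmetic over a parameter space, which a short sketch makes explicit. The three parameters below and their value sets are illustrative assumptions, not a proposed inventory; with n binary parameters the space has 2^n members.

```python
from itertools import product

# Hypothetical parameters and their possible values (for illustration only).
PARAMETERS = {
    "null_subject": (True, False),
    "bounding_node": ("TP", "CP"),
    "head_initial": (True, False),
}

def possible_grammars(parameters):
    """Yield every G as one assignment of a value to each parameter."""
    names = list(parameters)
    for values in product(*(parameters[name] for name in names)):
        yield dict(zip(names, values))

grammars = list(possible_grammars(PARAMETERS))
print(len(grammars))  # 2 * 2 * 2 = 8
for g in grammars[:2]:
    print(g)
```

On this picture, studying how languages vary amounts to discovering the keys and value sets of something like `PARAMETERS`, which is why variation looked like a direct window onto FL.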
But, from an MP point of view, parameters are suspect.
Recall, the conceit is that the less linguistic idiosyncrasy built into FL, the
better. Parameters are very very idiosyncratic (is TP or CP a bounding node?
Are null subjects allowed?). So the idea of FL internal parameters is MP unwelcome. Does this deny that there is
variation? No. It denies that variation is parametrically constrained.
Languages differ, there is just no finite limit to how they might.
Note that this does not imply that anything goes. It is
possible that no Gs allow some feature without it being the case that there is
a bound on what features a G will allow. So invariances (aka: principles) are
fine. It’s parameters that are suspect. Note that, on this view, the value of
work on variation needs rethinking. It may tell you little about the internal
structure of FL (though it might tell you a lot about the limits of the
invariances).[5]
Note further that this drives a wedge between
standard linguistic research (so much is dedicated to variation and typology)
and the central focus of MP research, the structure of FL. In contrast to
P&P theories where typology and variation are obviously relevant for the study of FL, this is less obvious (I
would go further) in an MP setting. I tend to think that this fact influences
how people understand the virtues and achievements of MP, but as I’ve made this
point before, I will leave it be here.
Last, I think that the MP problematic encourages a healthy
disdain for surface appearances, even more so than prior GBish work. Here’s
what I mean: if your interest is in simplifying FL and relating the distinctive
features of language to Merge then you will be happy downplaying surface
morphological differences. So, for example, if MP leads you to consider a Merge
based account of binding, then reflexive forms
(e.g. ‘himself’) are just the morphological residues of I-merge. Do they have
interesting syntactic properties? Quite possibly not. They are just surface
detritus. Needless to say, this way of describing things can be seen, from
another perspective, as anti-empirical (believe me, I know whereof I write). But
if we really think that all that is G distinctive leads back to Merge then if
you think that c-command is a distinctive product of Merge and you find this in
binding then you will want to unify I-merge and binding theory so as to account
for the fact that binding requires c-command. But this will then mean ignoring
many differences between movement and binding, and one way to do this is to
attribute the differences to idiosyncratic “morphology” (as we did in
eliminating constructions). In other words, from an MP perspective there are
reasons to ignore some of the data that linguists hold so dear.
There is a line (even Chomsky has pushed it) that MP offers
nothing new. It is just the continuation of what we have always done in GG.
There is one sense in which I think that this is right. The questions linguists
have investigated follow a natural progression if one’s interest is in the
structure of FL. MP focuses on the next natural question to ask given the prior
successes of GG. However, the question itself is novel, or at least it is
approachable now in ways that it wasn’t before. This has consequences. I
believe that one of the reasons behind a palpable hostility to MP (even among
syntacticians) is the appreciation that it does change the shape of the board.
Much of what we have taken for granted is rightly under discussion. It is like
the shift away from constructions, but in an even more fundamental way.
[2] I discuss this again in a forthcoming post. I know you cannot wait.
[3] In other words, this argument form is not particularly novel when applied to language. As such, one should take care to avoid methodological dualism and not subject the linguistic application of this gambit to higher standards than generally apply.
[5] A personal judgment: I don’t believe that cross-linguistic study has generally changed our views about the principles. But this is very much a personal view, I suspect.
This is "The No Title Yet Blues"?
You see, it really is time to aestivate.
To underscore your point about (2') above, linguistic promiscuity, I seem to recall Chomsky sometimes emphasizing that you can take a child from even the most genetically isolated populations and witness the same result (thereby applying even more pressure to traditional adaptationist stories).