I have a confession to make: I am not a fan of parameters. I
have come to dislike them for two reasons. First, they don’t fit well with what
I take to be Minimalist Program (MP) commitments. Second, it appears that the
evidence that they exist is relatively weak (see here
and here
for some discussion). Let me say a bit more about each point.
First, the fit with MP: as Chomsky has rightly emphasized, the more we pack into UG (the linguistically proprietary features of FL), the harder it is to solve Darwin’s Problem in the domain of language. This is a quasi-logical point, and not really debatable. So, all things being equal, we would like to minimize the linguistically specific content of FL. Parameters are poster children for this sort of linguistically specific information. So, any conception of FL that comes with an FL-specified set of ways that Gs can differ (i.e. a specification of the range of possible options) comes at an MP cost. This means that the burden of proof for postulating FL internal parameters is heavy, and that postulating them should be resisted unless we face overwhelming evidence that we need them.[1]
This brings us to the second point: it appears that the evidence for such FL internal parameters is weak (or so my informants tell me when I do fieldwork among variationists). The classical evidence for parameters
comes from the observation that Gs differ wholesale, not just retail. What I mean by this is that surface changes come in large units. The classic example is
Rizzi’s elegant proposal linking the fixed subject constraint, pro-drop and
subject inversion. What made these proposals more than a little intriguing is
that they reduced what looked like very diverse G phenomena to a single source
that further appeared to be fixable on the basis of degree-0 PLD. This made
circumscribing macro-variation via parameters empirically very enticing. The problem was that the proposals linking the variation to single parameter-setting differences proved to be empirically problematic.
What was the main problem? It appears that we were able to
find Gs that dissociated each of the relevant factors. So we could get absence
of fixed subject condition effects without subject inversion or pro-drop. And we could find pro-drop without subject
inversion. And this was puzzling if these surface differences all reflected the setting of a single parameter value.
I used to be very moved by these considerations but a recent
little paper on domestication has started me rethinking whether there may not
be a better argument for parameters, one that focuses less on synchronic facts
about how Gs differ and more on how Gs change over time. Let me lay out what I
have in mind, but first I want to take a short detour into the biology of
domestication because what follows was prompted by an article on animal
domestication (here).
This article illustrates the close conceptual ties between
modern P&P theories and biology/genetics. This connection is old news and
leaders in both fields have noted the links repeatedly over the years (think
Jacob, Chomsky). What is interesting for
present purposes is how domestication has the contours of a classic parameter
setting story.
It seems that Darwin was the first to note that
domestication often resulted in changes not specifically selected for by the
breeder (2):
Darwin noticed that, when it came
to mammals, virtually all domesticated species shared a bundle of
characteristics that their wild ancestors lacked. These included traits you
might expect, such as tameness and increased sociability, but also a number of
more surprising ones, such as smaller teeth, floppy ears, variable colour,
shortened faces and limbs, curly tails, smaller brains, and extended juvenile
behaviour. Darwin thought these features might have something to do with the
hybridisation of different breeds or the better diet and gentler ‘conditions of
living’ for tame animals – but he couldn’t explain how these processes would
produce such a broad spectrum of attributes across so many different species.
So, we select for tameness and we get floppy ears. Darwin’s observation was strongly confirmed many years later by the dissident Soviet biologist Dimitri Belyaev, who domesticated silver foxes. More specifically
(5):
He selected his foxes based on a single trait:
tameness, which he measured by their capacity to tolerate human proximity
without fear or aggression. Only 5 per cent of the tamest males and 20 per cent
of the tamest females were allowed to breed.
Within a few generations, Belyaev
started noticing some odd things. After six generations, the foxes began to wag
their tails for their caretakers. After 10, they would lick their faces. They
were starting to act like puppies. Their appearance was also changing. Their
ears grew more floppy. Their tails became curly. Their fur went from silver to
mottled brown. Some foxes developed a white blaze. Their snouts got shorter and
their faces became broader. Their breeding season lengthened. These pet foxes
could also read human intentions, through gestures and glances.
So, selecting for tameness, narrowly specified, brought in
its wake tail wagging, floppy ears, etc. The reasonable conclusion from this large-scale change in traits is that they are causally linked. As the piece
puts it (5):
What the Belyaev results suggest
is that the manifold aspects of domestication might have a common cause in a
gene or set of genes, which occur naturally in different species but tend to be
selected out by adaptive and environmental pressures.
There is even a suggested mechanism: something called “neural
crest cells.” But the details do not really matter. What matters is the reasoning: things that change together do so because of some common cause. In other words, common change suggests common cause. This is related to (but not identical to) the discussion about Gs above. That discussion looks at whether G traits necessarily co-occur at any given time. The discussion here zeroes in on whether, when they change, they change together. These are different diagnostics. I mention this because the fact that the traits are not always found together does not imply that they would not change together.
The linguistic application of this line of reasoning is
found in Tony Kroch’s diachronic work. He argued that tracking the rates of
change of various G properties is a good way of identifying parameters.[2]
However, what I did not appreciate when I first read this work was that the fact that features change together need not imply that they must always be found together. Here’s what I mean.
Think of dogs. Domestication brings with it floppy ears. So
select for approachability and you move from feral foxes with pointy ears to
domesticated foxes with floppy ears.
However, this does not mean that every domesticated dog will have floppy
ears. No, this feature can be detached from the others (and breeders can do this while holding many other traits constant), even though, absent any attempt to detach it, the natural change will be toward floppy ears. So we can select against a natural trait even if the underlying relationship links it to the others. As the quote above puts it: traits that occur naturally together can be adaptively selected out.
In the linguistic case, this suggests that even if a parameter links some properties together (so that if one changes they all will), we need not find them together in any one G. What we find at any one time will
be due to a confluence of causes, some of which might obscure otherwise extant
causal dependencies.
So where does this leave us? Well, I mention all of this
because though I still think that MP considerations argue against FL internal
parameters, I don’t believe that the observation that Gs can treat these
properties atomically is a dispositive argument against their being
parametrically related. Changing together looks like a better indicator of
parametric relatedness than living together.
Last point: common change implies common cause. But common
cause need not rest on there being FL internal parameters. Parameters are one way of causally linking seemingly disparate factors, but it is not clear that they are the only or even the best way. What made Rizzi’s story so intriguing (at least for me) is that it tied together simple changes visible in main clauses with variation in embedded clause effects that are not visible in the PLD. So one could reason from what is available in the PLD to what will be true of the LD in general. These cases are where parameter thinking really pays off, and they still seem to be pretty thin on the ground,
as we might expect if indeed FL has no internal parameters.
[1]
There is another use of ‘parameter’ where the term is descriptive and connotes
the obvious fact that Gs differ. Nobody does (or could) object to parameters in this sense. The MP-challenging usage is the one wherein FL prescribes a (usually finite) set of options that circumscribes the (usually finite) space of possible Gs. Examples include the pro-drop parameter, the V-raising parameter, and the head parameter.
[2]
See his “Reflexes of grammar in patterns of language change.” You can get this
from his website here. Here’s a
nice short quote summing up the logic: “…since
V to I raising in English is lost in all finite clauses with tensed main verbs
and at the same rate, there must be a factor or factors which globally favor
this loss” (32).