As FoLers know, I do not believe that linguists (or at least
syntacticians) highly prize theoretical work. Just the opposite, in fact. This
is why, IMO, the field has tolerated (rather than embraced) the minimalist
project (MP) and why so many professionals believe MP to have largely been a
failure despite what (again IMO) are its evident overall successes. As I’ve
argued this at length before, I will not do so again here. Rather I would like
to report on an interesting paper that I have just re-read that tries to
elucidate three distinct kinds of theoretical work. The paper is an old one
(published in 2000). It’s called “Thinking about Mechanisms” and the authors,
three philosophers, are Peter Machamer, Lindley Darden and Carl Craver (MDC). Here
is a link. The paper concentrates on elucidating the notion of a mechanism and
argues that it is the key explanatory notion within neurobiology and molecular
genetics. The discussion is interesting and I recommend it. In what follows, I
would like to pick out some points at random that MDC makes and relate them to
linguistic theorizing. This, I hope, will encourage others to look more kindly
on theoretical work.
MDC defines the notion of a mechanism as follows:
Mechanisms are entities and
activities organized such that they are productive of regular changes from
start or set-up to finish or termination conditions… To give a description of a
mechanism for a phenomenon is to explain that phenomenon, i.e. to explain how
it was produced. (3)
So, mechanisms are theoretical constructs whose features
(the “entities,” their “properties,” and the “activities” they partake in) explain how phenomena of interest arise.
So mechanisms produce phenomena (in biology, in real time) in virtue of the
properties of their parts and the activities they engender. [1]
MDC divides a mechanistic description into three parts: (i)
Set-up Conditions, (ii) Termination Conditions and (iii) Intermediate
activities.
The first, set-up conditions, are “idealized descriptions”
of the beginning of the mechanism. Termination conditions are “idealized states
or parameters describing a privileged endpoint.” The intermediate steps provide
an account of how one gets from the initial set-up to the termination
conditions, which describe the phenomenon of interest. (11-12)
This should all sound vaguely familiar. To me it sounds very
much like what linguists do in providing a grammatical derivation of a sentence
of interest. We start with an initial structure (e.g. a D(eep) S(tructure)
representation) and explain some feature of a sentence (e.g. why the syntactic
subject is interpreted as a thematic object) by showing how various operations
(i.e. transformations) lead from the initial to the termination state. Doing
this explains why the sentence of
interest has the properties to be explained. Indeed, it has them in virtue of being the endpoint of the
licit derivation provided.
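For readers who like to see the parallel spelled out, here is a minimal toy sketch of a derivation laid out in MDC's three parts. It is an illustration under loud assumptions, not a serious grammar: the names `Structure`, `move_object_to_subject` and `derive`, the flat tuple representation and the trace symbol `t` are all hypothetical conveniences of mine, and the single "transformation" is a cartoon of passive-style movement.

```python
from typing import Callable, List, Tuple

# Hypothetical toy representation: a structure is just a flat tuple of words.
Structure = Tuple[str, ...]
Operation = Callable[[Structure], Structure]


def move_object_to_subject(s: Structure) -> Structure:
    """Cartoon transformation (an assumption for illustration):
    front the final element (the thematic object) and leave a trace 't'."""
    *rest, obj = s
    return (obj, *rest, "t")


def derive(setup: Structure, operations: List[Operation]) -> List[Structure]:
    """Run the intermediate activities in order, recording every step
    from the set-up condition to the termination condition."""
    steps = [setup]
    for op in operations:
        steps.append(op(steps[-1]))
    return steps


# Set-up condition: an idealized initial (DS-like) structure.
deep_structure: Structure = ("was", "arrested", "John")

# Intermediate activities: the operations applied in the derivation.
steps = derive(deep_structure, [move_object_to_subject])

# Termination condition: the structure whose properties we want explained.
for step in steps:
    print(" ".join(step))
```

The last structure has `John` in subject position with a trace in object position, so the recorded steps show how the syntactic subject comes to be interpreted as the thematic object: the explanation lies in the path from set-up to termination, not in the endpoint alone.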
Note too that both mechanisms and generative procedures (GPs) focus on idealized situations. GPs describe the
linguistic competence of an ideal
speaker-hearer or the FL of an idealized LAD. So too with biological
mechanisms. They describe idealized
hearts or kidneys or electrical conduction at a synapse. Actual instances are
not identical to these, though they function in the same ways (it is hoped). No
two hearts are the same, yet every idealized heart is identical to any other.
This said, derivations are not actually mechanisms in MDC’s
sense for they do not operate in real time (unlike the ones biologists are
typically describing (e.g. synaptic transmission or protein synthesis)).
However, generative procedures (GP) are the “mechanisms” of interest within GG
for it is (at least in large part) in virtue of the properties of GPs that we
explain why native speakers judge the linguistic objects in their native
languages as they do and why Gs have the properties they have. Furthermore, as
in biology, the aim of linguistics is to elucidate the basic properties of GPs
and try to explain why they have the properties they have and not others. So,
GPs in linguistics are analogous to mechanisms in other parts of biology.
Phenomena are interesting exactly to the degree that they serve to shine light
on the fine structures of mechanisms in biology. Ditto with GPs in linguistics.
MDC notes that a decent way to write a history of biology is
to trace out the history of its mechanisms. I cannot say whether this is so for
the rest of biology, but as regards linguistics, there are many worse ways of
tracing the history of modern GG than by outlining how the notion of GP has
evolved over the last 60 years. There is
a reasonable argument to be made (and I have tried to make it (see here
and the following four posts)) that the core understanding of GP has become simpler
and more general in this period, and that the Minimalist Program is a
conservative extension of prior work describing the core properties of a human
linguistic GP. Not surprisingly, this has analogues in the other kinds of
biological theorizing MDC discusses.
So, the core explanatory
construct in biology according to MDC is the mechanism. As MDC puts it: “…a
mechanistic explanation…renders a phenomenon intelligible…Intelligibility
arises not from an explanation’s correctness, but rather from an elucidative
relation between the explanans (the set-up conditions and the intermediate
entities and activities) and the explanandum (the termination condition or the
phenomenon to be explained)” (21).
MDC is at pains to point out that this “elucidative
relation” holds regardless of the
accuracy of the description. So explanatory potential
is independent of truth, and what theory aims at are theories with such potential.
Explanatory potential relies on elucidating how something could work, not how it does. The gap between possibility and
actuality is critical for the theoretical enterprise. It’s what allows it a
certain degree of autonomy.
For such autonomy to be possible it is critical to
appreciate that explanatory potential (what I have elsewhere called “oomph”) is
not reducible to regularity of behavior. Again MDC (21-22):
We should not be tempted to follow
Hume and later logical empiricists into thinking that the intelligibility of
activities (or mechanisms) is reducible to their regularity. Descriptions of
mechanisms render the end stage intelligible by showing how it is produced by
bottom out entities and activities. To explain is not merely to redescribe one
regularity as a series of several. Rather, explanation involves revealing the productive relation. It is the
unwinding, bonding, and breaking that explain protein synthesis; it is the
binding, bending, and opening that explain the activity of Na+
channels. It is not the regularities that explain the activities but the
activities that sustain the regularities.
In other words, mechanisms are not (statistical) summaries of what something regularly does.
Regularities/summaries do not (and cannot) explain, and as mechanisms aim to
explain they must be more than such summaries no matter how regular. Mechanisms
outline how a phenomenon has (or could have) arisen, and this requires
outlining the structures and principles that mechanisms deploy to “generate”
the phenomenon of interest.[2]
Importantly, it is the relative independence of explanatory
potential from truth that allows theory to have an independent existence. MDC
suggests three different grades of theoretical involvement summed up in three
related but different questions: How possibly? How plausibly? How actually? Let
me elaborate.
Explanations are hard. They are hard precisely because they
must go beyond recapitulating the phenomenon of interest. Finding the right
concepts and putting them together in the right way can be demanding. Here is
an example of what I mean (see here
for an earlier discussion).
The Minimalist Program (MP) has largely ignored ECP effects
of the argument/adjunct asymmetry variety.
Why so? I would contend it is because it is quite unclear how to
understand these effects in MP terms. In this respect ECP effects contrast with
island effects. There are MP compatible versions of the latter, largely
recapitulating earlier versions of Subjacency Theory. IMO, such accounts are
not particularly elegant, nor particularly insightful. However it is possible to pretty directly trade
bounding nodes for phases, escape hatches for phase edges and Subjacency
Principles for Phase Impenetrability Conditions in largely a one for one swap
and thereby end up with a theory no worse than the older GB stories but cast in
an acceptable MP idiom. This does not constitute a great theoretical leap
forward (and so, if this is correct, for these phenomena thinking a la MP does
not deepen our understanding), but at least it is clear how island effects could hold within an MP style conception
of G. They reduce to Subjacency Effects albeit with all the parts suitably renamed.
In other words, the theoretical and conceptual resources of MP are adequate to
recapitulate (if not much illuminate) the theoretical and conceptual resources
of earlier GB.
This is not so for ECP effects. Why not? Well for several
reasons, but the two big ones are that the ECP is a trace licensing condition
and the technology behind it appears to run afoul of inclusiveness. Let’s
discuss each point in turn.
The big idea behind the ECP is that traces are grammatically
toxic unless tamed. They can be tamed by being marked (gamma-marked) by a local
antecedent throughout the course of the derivation. The distinction between
arguments and adjuncts arises from the assumption that argument A’-chains can
be reduced, thereby eliminating their -gamma marked carriers and thereby not
cancelling the derivation at LF (recall, -gamma marked expressions kill a
derivation). So, traces are toxic, +gamma marking tames them, and deletion acts
differently for adjuncts and arguments which is why the former are more restricted
than the latter. This, plus a kind of uniformity principle on chains (not a
great or intuitive principle IMO, but maybe this is just me) which invidiously
distinguishes adjunct from argument chains,[3]
yields the desired empirical payoff.
Given the complexity of the ECP data, this is an
achievement. Whether it constitutes much of an explanation is something people
can disagree about. However, whatever its value, it runs afoul of what appear
to be basic MP assumptions. For example, MP eschews traces, hence there is
little conceptual place for a module of the grammar whose job it is to license
them. Second, MP derivations reject adding little diacritics to expressions in
the course of a derivation. If indices are technicalia non grata, what to make
of +/-gamma marks? Last, MP derivations are taken to be monotonic (No
Tampering), hence frowning upon operations that delete information on the “LF”
side of a derivation. But deleting -gamma marked traces is what “explains” the
argument/adjunct difference. So, the standard GB story doesn’t really fit with
basic MP assumptions and this makes it fruitful to ask how ECP effects could possibly be modeled in MP style
accounts. And this is a job for
theorists: to come up with a story that could
fit, to find the right combination of MP compatible concepts that would yield roughly the right empirical outcomes.[4]
The theoretical challenge, then, in the first instance, is to explain how to possibly fit ECP effects into an MP
setting, given the elimination of traces and a commitment to derivational
monotonicity.
There are additional why questions out there begging for how
possible scenarios: e.g. Why case? Why are phrase markers organized so that
theta domains are within case/agreement domains that are within A’ (information
structure) domains? Why are reflexivization and pronominalization in
complementary distribution? Why is selection and subcategorization so local? I
could go on and on. These why questions are hard not because we have tons of
possible explanatory options but cannot figure out which one to run with, but
because we have few candidate theories to run with at all. And that is a theoretical challenge, not just an
empirical one. It’s in situations like these that how-possibly becomes a
pressing and interesting issue. Sadly, it is also something that many working
syntacticians barely attend to.
MDC notes a second level of theoretical involvement: how plausible is a certain possible story. Clearly to ask this
requires having a how possible scenario or two sketched out. It is tempting to
think that plausibility is largely a matter of empirical coverage. But I would
like to suggest otherwise. IMO, plausibility is evaluated along two dimensions:
how well the novel theory covers the older (gross) empirical terrain and how
many novel lines of inquiry it prompts. A theory is plausible to the degree
that it largely conserves the results
and empirical coverage of prior theory (what one might call the “stylized
facts”) and the degree to which it successfully explains things that earlier
theory left stipulative. Again, let me illustrate.
Clearly, plausibility is more demanding than possibility. Plausible
theories not only explain, but have verisimilitude (we think that they have a
decent chance of being correct). What are the marks of a plausible account?
Well, they cover roughly the same empirical territory as the theory they are
replacing and they explain what
earlier theory stipulated. Here are a couple of examples.
I believe that movement theories of binding and control are
plausible precisely because they are able to explain why Obligatory Control
(OC) and reflexivization have many of the properties they do. For example, we
typically find neither in the subject position of finite clauses (e.g. John
expects PRO to/*will win, John expects him(he)self to/*will win). Why not? Well
if the movement theory is right, then they are parts of A-chains and so should
pattern like what we find in analogous raising constructions (e.g. John was
expected t to/*would win), and they do. So the movement theory derives what is
largely stipulated in earlier accounts and exposes as systematic relations that
earlier theory treated as coincidental (that finite subject positions don’t
allow PRO, reflexives or A-traces). Does
this make such accounts true? Nope. But it does enhance their plausibility.
Thus, being able to unify these disparate phenomena and provide principled
explanations for the distribution of OC PRO and for the relative paucity of
nominative reflexives enhances their claims on truth.
Note that here plausibility hinges on (1) accepting that
prior accounts are roughly descriptively accurate (i.e. doing what decent
science always does; building on past work and insight) and (2) explaining
their stipulated features in a principled way. When a story has these two
features it moves from possible to plausible. Of course, demonstrating
plausibility is not trivial, and what some consider plausibility enhancing
others will find wanting. But that is as it should be. The point is not that
theorizing is dispositive (nothing is) but that it strives for goals different
from empirical coverage (and this is not intended to disparage the latter).
Let me put this another way. When one has a possible
explanation in hand it is time to start looking for evidence in its favor. In other words, rather
than looking for ways to reject the account one looks for reasons to accept it
as a serious one. Trying to falsify (i.e. rigorously test) a proposal has its
place, but so does looking for support. However, trying to falsify a possible
theory is premature. What one should test are the plausible ones, and that
means finding ways to elevate the possible to a higher epistemological plane:
the territory of the plausible. That’s what how-plausibly theory aims to do: find
the fit between something that is possible and what has come before and show
that the new possible story is a fecund extension of the old. It is an
extension in that it covers much of the same territory. It is fecund in that it
improves on what came before. This kind of theorizing is also hard to pull off,
but like how-possible theory, it relies heavily on theoretical imagination.
Which brings us to how-actually investigations. This is
where the theory and the data really meet and where something bearing a family
resemblance to falsification comes into play. Say we have a plausible theory; the
next step is to tease out ways of testing its central assumptions. This, no doubt, sounds obvious. But I would
beg to differ. Much of what goes on in my little area of linguistics fails to
test central postulates and largely concentrates on seeing how to fit current
theoretical conceptions to available data (e.g. how to apply a Probe-Goal
account to some configuration of agreement/case data). There is nothing wrong
with this, of course. But it is not quite “testing” the theory in the sense of
isolating its central premises/concepts and seeing how they fly. Let me give
you an example.
I personally know of very few critical tests driven by
thoughtful theorizing. But I do know of one: the Aoun/Choueiri account of
reconstruction effects (RE). The reasoning is as follows: If REs are
reflections of the copy theory of movement (as every good Minimalist believes)
then where there is no movement, there should be no reconstruction (notice
movement is a necessary, not sufficient condition for RE). There is no movement
from islands, therefore there should be no RE within islands. Aoun/Choueiri
then goes on to argue that resumption in Lebanese Arabic is a movement
dependency (Demirdache argued this first I believe) and argues that whereas REs
are available when an antecedent binds a resumptive outside an island, they
fail systematically to arise with resumptives inside islands. This argues for
two central conclusions: (i) that REs are indeed parasitic on movement and (ii)
that resumption is a movement dependency. This vindicates the copy theory, and
with it a central precept of MP.
For now forget about whether Aoun/Choueiri is right about
the facts.[5]
The important point here is the logic. The test is interesting because it very clearly implicates key
features of current theory: the copy theory of movement, islands as
restrictions on movements and REs as piggybacking on copies. These are three central features and the
argument if correct tests them. And this is interesting precisely because they
are central ideas in any MP style
account. Moreover, it is very clear how the premises bear on the testable
conclusion.[6]
They can be laid out (that’s where theory comes in BTW, in laying out the premises
and showing how together they have
certain testable consequences) and a prediction squeezed from them. Moreover,
the premises, as noted, are theoretically robust. The Copy Theory of Movement
is a core feature of MP architectures, and locality conditions such as islands are central parts of
any reasonable GG theory of syntax. Hence if these pulled apart it would
indicate something seriously amiss with how we conceptualize the fundamentals
of FL/UG. And that is what makes the Aoun/Choueiri argument impressive.
Like I said, I personally know of only a couple of cases
like this. What makes it useful here is that it illustrates how to successfully
do how-actually theory (i.e. it is a paradigm case of how-actually theoretical
practice). Find consequences of core conceptions and use them to test the core
ideas. We all know that most of what we
believe today is likely wrong at least in
detail. Knowing this, however, does not mean that we cannot test the core features
of our accounts. But this requires determining what is central (which requires
theoretical evaluation and judicious imagination) and figuring out how to tease
consequences from them (which requires analytical acumen). In testing a
proposal to see how-actual we need to lead from theory to data, and this means
thinking theoretically by respecting the deductive structure that makes a theory
the theory it is.
How possible, how plausible, how actual; three grades of
theoretical involvement. All are useful. All require attention to the deductive
structure of the core ideas that constitute theory. All start with these ideas
and move outwards towards the phenomena that, correctly used, can help us
refine and improve them. Right now, theoretical work is largely absent from the
discipline, at least the how possibly and how plausibly variety. Even the how
actually kind is far less common than commonly supposed.
[1]
MDC distinguishes “substantivalists” and “process ontologists” wrt their
different understandings of mechanism. The difference appears to reside in
whether mechanisms comprise both “entities” and “activities” or whether
activities alone suffice. MDC takes it as obvious that reducing entities to
activities is hopeless (“As far as we know, there are no activities in
…biology…that are not activities of
entities” (5)). I mention this for the discussion is redolent of the current
discussion on FoL (Idsardi and Raimy discussing Hale and Reiss) concerning
substance free phonology. It is curious that the same kind of discussion takes
place in a very different venue and so it is worth taking a look at it in this
domain to gain leverage on the one in ours.
[2]
As an old friend (Louise Antony) once remarked: in answer to a question like
“why did this book drop to the floor when I let go?” it is not helpful to
answer that “it always drops whenever anyone lets go.”
[3]
I am sure it is not news that the distinction is not accurately described in
terms of arguments and adjuncts. But for the record, the absence of pair-list
readings of WHs extracted from weak islands seems to show the same
acceptability profile as adjuncts even though the WH moves from the complement
position. The difference seems to be less argument/adjunct and more individual
variable vs higher level variable interpretation.
[4]
FWIW, I think that this is where maybe a minimality style explanation of the
Rizzi-Cinque variety might be a better fit than the Lasnik-Saito/Barriers
approach. But even this story needs some detailed reworking.
[5]
There is some evidence from Jordanian Arabic contradicting it, though I am not
sure whether I believe it yet. Of course, you can take what I believe and still
need the full fare of about $2 to get a metro ride in DC.
[6]
I often find it surprising how few papers of a purported theoretical nature
actually set out their premises clearly and deduce the conclusion of interest.
More often, we use theory like putty and smear it on our favorite empirical
findings to see if, copiously applied, it can be used to hold the data together. Though
this method can yield interesting results, it is not theory driven and generally fails to address an identifiable
theoretical question.