In an indiscreet footnote to an earlier post I suggested
that linguistics, or at least generative grammar, or at least syntactic
“theory” had not yet escaped its philological roots. There are several marks of
this. The footnote mentioned the inordinate respect for data. I would not be surprised
if many readers saw this and either thought I was joking (me? Do you know me to
joke about such matters?) or nodded knowingly that such a cavalier attitude to
empirics is hardly surprising coming from me. However, I was not (entirely)
speaking tongue in cheek and would like to defend the proposition that despite
our best efforts, linguistics has yet to embody the true scientific spirit, one
that embraces theoretical insight as
one of the (maybe the) central goals
of inquiry. I will confine my remarks to syntax, but I believe that my qualms
extend beyond this particular bailiwick. To put my point bluntly: despite
prevailing assumptions, there is precious little theory in theoretical syntax
and this, I believe, is because we are still in the grip of the descriptivist
demon.
There are four kinds of research in the real sciences: (i)
phenomenological,
(ii) experimental, (iii) analytical/computational, and (iv)
theoretical. The first is roughly descriptive: this variable covaries with that
one in these circumstances (think pressure, volume and temperature). The second
aims to test theory (e.g. if there is an aether then given the earth’s spin
light will move measurably faster in this direction than that one). The third
is intimately related to the second. Oftentimes to get testable predictions
requires solving equations for special cases and this can demand a lot of
computational skill (think black holes and general relativity). The fourth kind
of work is theoretical and it aims to explain why we find what we find and why
we don’t find what we don’t. Fundamental theory achieves this using simple
natural principles. What makes this enterprise especially demanding is that a
large part of what theory does is discover
what ‘simple’ and ‘natural’ mean in a given context of inquiry. ‘Simple and
natural’ are a bit like pornography: hard to specify antecedently but
recognizable (at least over time) when paraded.
Research in syntax corresponds roughly to the
above four-part division. Thus, there are four kinds of syntax research:
(a) Work that analyzes data in a given language. These are generally organized
around constructions such as relative clauses in X or ‘tough’ constructions in
Y or resumptive pronouns in Z. Here the targets of explanation are the
intricacies of specific constructions and theory is used to explain the
observed intricacies. (b) Research that explores given theoretical
constructs empirically. In practice, this is often pursued in tandem with (a), but the two are conceptually different.
Here the aim is to refine or “test” a theoretical point and data is used to
polish theoretical details or choose between competing alternatives. One argues
from data to theory, rather than from theory to data. Examples include work in
the mid-1980s investigating how exactly to state the ECP and subjacency
conditions, looking for the right definition of government and binding domains.
Other examples can be found in much current work trying to state exactly which
heads are phases, to differentiate weak from strong phases, or to determine what
the right statement of defective intervention should be. A particularly popular sport is
to try and identify the right sets of features the grammar manipulates. (c) Papers
that aim to unify disparate UG generalizations. This work has both an
analytical and empirical dimension, with the second following the first. The classical example of this is Chomsky’s
reanalysis of Ross’s Islands in terms of the theory of subjacency and later
extensions of this that aimed to incorporate Huang’s CED effects into a theory
of bounding in terms of barriers. More current examples include Chomsky’s
proposed unification of PS and movement rules in terms of Merge, Pesetsky and
Torrego’s analysis of fixed subject effects in terms of economy, the
unification of control and binding with movement and/or Agree, among others. (d)
Papers whose aim is to “simplify” and conceptually “motivate” the basic
machinery, e.g. Minimality in terms of search, the cycle in terms of Extension,
Phase Impenetrability in terms of Multiple Spell Out, and islands as
PF/linearization effects.
All of these kinds of
research are valuable and contribute to answering the central questions in
linguistics. However, whereas the field values the first two unequivocally,
it has tended to be less welcoming to theoretical research of the (c)/(d)
variety above. Why so? Here are three speculations.
First, many linguists tend to confuse ‘theoretical’ and
‘formal.’ Since the beginning, Generative Grammarians, including syntacticians,
have had a proprietary technical formalism for describing linguistic phenomena.
Part of what it is to be a linguist involves mastering this technical formalism.
However, being formal does not entail being theoretical. Descriptive phonetics is quite formal but its
aims are not theoretical. Ditto for most of statistics, and, in my opinion,
semantics. Most theory is formal as formalism allows for the kind of
abstraction/idealization that theory lives on. But formal is not equivalent to
theoretical. Virtually all current work in linguistics is formal. Very little
is theoretical in the sense of (c)/(d) above.
Second, the field's wariness of theory fits with the widespread Empiricist belief that
science consists in the careful production and organization of facts. Theory, on this view, is effectively a way of
compactly summarizing the data: useful, but epistemologically
secondary. The main enterprise is to discover, collect, and organize the facts.
This Empiricist conception contrasts with the Rationalist
one. For Rationalists, correct theory identifies the underlying causal powers
at work in generating the observed data. What theory identifies, therefore, is
metaphysically prior to the observed data, as correct theory delineates the
basic causal ontology: the fundamental operations, powers, and primitives.
Thus, if the aim of science is to understand how things work then a science
succeeds to the degree that it can produce plausible theories. No theory, no
candidates for causal powers; no such candidates, no explanation. True, data is
part of (note, part of) how we
evaluate whether or not a theory is plausibly true (or even exactly true), but
a theory is not a compact summary of the (potential) data but a limning of the basic
metaphysical furniture.[1]
The Empiricist attitude fits well with another one, not
uncommon within linguistics, which luxuriates in the boundless variety that we
find within natural languages. The
theoretical impulse involves abstracting away from this diversity in order to
focus on the underlying invariant structures. To those impressed by the evident
variety of NL phenomena, such a theoretical attitude can appear to be
dismissive of the facts and hence non-scientific.
Third, until recently, there has not been much scope for
interesting theory. Theory lives when there are higher order generalizations to
unify and rationalize. One of the glories of Generative Grammar, IMO, has been
the discovery of a dozen or so of these kinds of non-trivial generalizations
that have a claim to be properties of FL/UG. Once these exist, theory can
emerge. Until these exist, whatever theory exists is so closely tied to the
data that separating empirical from theoretical speculation is
unproductive. Let me explain with a
simple example.
Chomsky’s unification of island phenomena presupposes that
Ross’s description of islands is more or less correct. Why? Because it is not worth unifying these
phenomena in a simpler, more natural framework if the various generalizations
that Ross proposed are wrong. However, given
that Ross was right, we can step back from the most immediate empirical data
(e.g. acceptability judgments about island violations) and start investigating
the properties of these generalizations. Or, to put this another way: Ross
focused on first level acceptability data (e.g. is this sentence acceptable,
how does this long distance dependency compare with that one wrt acceptability)
in order to establish his generalizations. Chomsky, given Ross’s
accomplishments, could now focus on a second level question: what features do
islands have in common, or, why do we have the islands we have? Whereas Ross’s
investigation was largely empirically driven (i.e. we postulate the CNPC
because “Who did you meet someone who saw?” is judged lousy compared to “Who
did you think that Bill said that you saw?”), Chomsky’s was not. For him the central
concern was mainly theoretical: to rationalize the diverse island phenomena in
terms of simpler, more natural principles. And this was indeed the main
motivation for Chomsky’s theoretical efforts:
… the island constraints can be
explained in terms of general and quite reasonable computational properties
of formal grammar (i.e. subjacency, a property of cyclic rules that
states, in effect, that transformational rules have a restricted domain of
potential application; SSC, which states that only the most prominent phrase in
an embedded structure is accessible to rules relating it to phrases outside;
PIC, which stipulates that clauses are islands subject to the language specific
escape hatch...). If this conclusion can be sustained, it will be a significant
result, since such conditions as CNPC and the independent wh-island
constraint seem very curious and difficult to explain on other grounds. (p.
89; On WH Movement, my emphasis)
Of course, good theory leads, one hopes, to novel testable
predictions. In the case of the Theory of Subjacency, the novel prediction was
successive cyclicity, which, we discovered, had manifestations testable in
terms of acceptability data (e.g. Kayne and Pollock on stylistic inversion in
French, McCloskey and Chung on Complementizer agreement in Irish and Chamorro,
etc.). However, the value of theory lies not merely in extending the factual
base, but in introducing new explanatory desiderata, as Chomsky does in the
quote above.
One of the most important contributions of the Minimalist
Program, IMO, has been to emphasize the value of this kind of theoretical
research, research that is one or two removes from the immediate empirical
data. However, my read on what people are actually doing suggests that this is
an idiosyncratic position. The bulk of MP research seems to me to fit largely
into categories (a) and (b) above. Moreover, much of it, IMO, could just as
usefully have been pursued in a GBish idiom, its minimalism being largely
technical (i.e. lots of features and probes and goals and AGREE rather than
overt and covert movements). This does not mean that this work is bad or unimportant.
But, it is not particularly theoretical. It is pretty similar to older styles
of research, ones driven largely by concerns of covering the first order data.
Let me put this another way: what a lot of minimalism has
done is substitute one set of technicalia (GBish) for another set (Phases,
probes/goals, feature-driven operations, etc.). The larger initial goal of the
program, to explain why we have the FL we do and not another, is often
invisible in the work being done. Papers
that even ask this why question are
pretty rare. Indeed, about as rare as papers in GB that worried about whether
the parameters being proposed or the principles being endorsed really alleviated
the learnability problem. Let me be
clear here, or clearish: I am not saying that such questions are easy to
answer. If they were, we would not have a field. But, it is surprising to me
how rarely theory is even sensitive
to their relevance. The focal question of GB theory was the learnability issue.
The focal point of MP is the ‘why this FL/UG rather than other conceivable
ones’ question. All the other stuff is the necessary prerequisite for
theoretical work. It is not the same as theoretical work.
So, why the dearth of theory even within the confines of MP?
Maybe because the opportunities for doing this style of work are relatively
recent (despite some very impressive precedents) and it takes a while for a
novel style to gain a foothold. After all, we all know how to do research of
the (a) and (b) type. Furthermore, we know how to teach it and test it and PhD
it and review it. Maybe work of the (c)/(d) variety is less common as we have
not yet found ways to impart its virtues. Maybe. But I suspect that work of the
(c)/(d) variety will prove to be a harder sell precisely because of its greater
inherent abstractness. The real message of MP is that you should not simply
look at what’s sitting before your eyes, and this has always proven to be a
tough sell, unfortunately.
Let me end by reiterating that all the varieties of research
noted above are valuable and important. Theorists cannot do their work
unless less theoretical work is pursued. However, theoretical work is different, and right now, it’s having
a tough time gaining traction. One of the initial hopes of MP was to refocus
activity in more theoretical directions. IMO, to date, this hope has not been especially
well realized.
In the context of theoretical vs. formal in linguistics, this quote from Jan Koster (here) sums it up, I think:
ReplyDelete"Many linguists immerse themselves in technical detail, which is necessary but always runs the risk that the field degenerates into a continuation of philology by other means."
Do we have any reason to think that in other sciences, pure theory is any less of a minority project? My impression is that the science departments I'm familiar with have many more people focussing on experiments in their labs rather than sitting back and thinking about what's behind it all. Perhaps there is an optimal balance (somewhere in the 80% data, 20% theory region)?
I don't know actually. However, I get a sense that in other sciences there is a better feel for the difference between "real" theoretical work and other kinds of work. In syntax, it seems that anyone who works on syntax is a theoretician. Does this make a difference? Perhaps. I think that within syntax theory needs some nurturing. It is only relatively recently that interesting theory has become possible. Treating everything that a syntactician does as theory makes it hard for this other, different kind of practice to find a foothold. Note, to forestall revulsion: none of this is meant to imply that theory is better than the other stuff or that it is even more important. It is only meant to suggest that unless we understand the differences we will end up lumping them all together, to the detriment of theoretical work.
It also depends on which (sub)field you're looking at. In physics, pure theory is indeed held in high regard and there is a split between theoreticians, who are pretty much mathematicians nowadays, and experimentalists who produce data and test theoretical predictions. But things are already muddled in chemistry, and biology is pretty much in the same situation as linguistics. Recently there has been some interesting theoretical work, such as deriving the maximum possible size of animals from the properties of their metabolism and the environment they live in, but those are still outliers.
There's also the issue that many people would consider all of mainstream syntax theoretical because it isn't applied. In computer science, for example, the split isn't between theory- and data-driven but between theoretical and applied. And since so little of linguistics is being recycled in computational linguistics, all of it is theoretical from that perspective.
"In syntax, it seems that anyone who works on syntax is a theoretician ... treating everything that a syntactician does as theory makes it hard for this other different kind of practice to find a foothold".
I think this is a very good point. The usage of the term "theoretical linguistics" seems quite unhelpful. It causes a related problem, I think, in psycholinguistics: there is a tendency to treat "theoretical linguistics" as disjoint from "psycholinguistics", which makes it difficult for an empirical/theoretical split within psycholinguistics to find a foothold.
There is some sense in which syntacticians/semanticists/etc. are doing "more abstract" work than psycholinguists are, and this probably underlies the tendency to label the former "theoretical", but I don't think the abstraction involved is the kind of abstraction that separates Norbert's (a)/(b) from his (c)/(d). It's perhaps a bit more like the relationship between physics and chemistry; the terminology we have at the moment in linguistics is analogous perhaps to calling chemistry "theoretical physics" because of the higher level of abstraction.
(There is also the separate question about whether it's useful to associate the prefix "psycho-" with the lower level of abstraction, which I think produces distinct confusions of its own.)
You're definitely correct that there is a split in popularity between research of types A and B versus types C and D. Two pieces of evidence:
1) There are no attempts at pure unification. Chomsky's unification of Ross's island constraints, for example, isn't really a unification but rather a reanalysis that covers all the data and, crucially, makes new predictions that are empirically borne out. I think if Chomsky had just provided a technical unification of all known island constraints without new empirical insights this line of work wouldn't have been received quite as enthusiastically. Quite generally, whenever you're reading a paper that claims to unify various analyses, you are guaranteed to encounter a chapter that discusses new empirical predictions. So unification is not considered important by itself; what matters in the end is the empirical payoff.
2) If alternate accounts are compared, the result is publishable only if the accounts turn out to be distinct. I can't think of a single linguistics paper that concludes that two approaches are empirically equivalent, even if that means making rather ad hoc assumptions about how those approaches may be modified or extended, what the evaluation metric should be, etc. That's actually not surprising, because linguists are never taught that having equivalent theories is a good thing (at least I wasn't, and UCLA is a fairly theoretical place). Physics is full of empirically equivalent theories, and in mathematics equivalence theorems are some of the most useful. That's because such equivalences increase your understanding of the problem at hand: having multiple perspectives is a good thing. Of course, if you care primarily about the empirical payoff, then such results are pretty much worthless.
In sum, if you want to publish theoretical work, make sure there is some empirical pay-off and your results aren't too abstract. Hmm, why does it feel like I've written something like this before? ;)
A rare exception to the absence of celebrated equivalence results is Robin Cooper & Terry Parsons' demonstration that a 'Generative Semantics'-like and an 'Interpretive Semantics'-like treatment of quantifiers were equivalent (in Partee (ed.) 1976, _Montague Grammar_). This certainly made an impression on people in the early-to-mid 70s.