So here’s my question: What’s the point of theta theory?
What does a theta role do? Here’s my impression: we want theta theory to do two
different kinds of things and it is not clear to me that any theory can (or should) do both. What are these two things? They
are an integral part of the semantic interpretation of a sentence and they are
the means by which arguments are linked to syntactic positions in
“D-structure.” I should point out that noting these dual desiderata is not original
to me, but arises from what I recall were earlier important discussions of
these matters by Dowty, Grimshaw, and others.
Nonetheless, I feel that these issues have become more obscure over time
and I would like to engage in a rambling re-think. This is all in the way of
excusing the shambolic nature of what follows. Hard as it is for me to present
clear arguments in general, in this case I am not even going to try. I just
want to sorta kinda survey the options and try to clear up my own confusion.
Needless to say, I am relying on the kindness of others to clear up the mess.
Here goes.
The literature seems to have two different (though possibly
related, we shall see) desiderata for theta roles:
(1) Theta roles are required for semantic interpretation.
(2) Theta roles are required to get the LAD from primary linguistic data to a G.
Let’s discuss each of these a little bit. The first view of
theta roles treats them as essential semantic notions. Without theta roles,
arguments would not have a semantic interpretation. Given that Gs map
meanings and (“with,” if you are a thoroughly modern minimalist (TMM)) sounds,
we need some conception of meaning to serve as the target of the mapping, and
theta roles are taken to be one component of a well-formed meaning.
The second view treats theta roles as levers for getting a language
acquisition device (aka: a child or LAD) from primary linguistic data (PLD) to
a G, most particularly, from PLD to a “D”-structure. The ‘D’ is in scare
quotes here, for as any TMM knows we have dispensed with D-structure in the GB
sense, yet, so far as I know, every theory of GG has some analogue thereof,
including current minimalist accounts. By ‘D-structure’ I just mean the G
structure in which arguments are grammatically linked up to their predicates
(or vice versa).[1]
A G establishes thematic links before establishing any further dependencies
that expressions grammatically enter into (e.g. agreement, case, binding, and
especially, movement). Fixing this first relation, the one where arguments join
predicates, is very important because it is very hard to study all the other
dependencies, especially movement, if you have no idea where expressions begin
their grammatical lives.
Is there a necessary relation between these two desiderata?
Perhaps, perhaps not (though I suspect not). Here’s what I mean. It might be
that the conception of theta role required for semantic interpretation is
identical to the one that is used to get LADs from PLD to Gs. However, there is no obvious reason why this
need be the case. In particular, the conception of theta role required for semantic
interpretation seems to be at a different grain than the one useful for
priming the G pump. Let me explain.
One conception of theta role is simply as a place-holder for
the notion “argument.” For example, all we mean when we say that some DP has
the “agent” theta role is that it is the external argument of some
predicate. The designation “agent” does
not mean much, save indicating which of the ordered arguments of a predicate
some DP is related to.[2]
The problem with this conception is that it is not clear how
it helps with (2). In particular, it is quite unlikely that LADs know the
meanings of the predicates they are being exposed to and so it is not clear how
they could use this thin sense of theta role to acquire their G. Rather, what
we would like is some good coarse rule of thumb that the LAD can use to vault
into the G given some PLD. This is where notions like ‘agent’ and ‘patient’
gain their value. Being an agent or patient (a doer or done-to) is plausibly an
observational feature of an event
participant. In other words, the substantive interpretation of notions like
agent and patient plausibly has what Chomsky called “epistemological priority”
(EP). They are observable non-linguistic
predicates that can be used to map PLD to (non-observable) grammatical
dependencies. An example of such a useful mapping rule would be “Agents are
always external arguments, patients always internal arguments.” If every D-structure
corresponded to a set of (observable) theta roles with the right linking rules,
then we could solve the problem of how an LAD gets from PLD to abstract Gish
structures.
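To make the shape of such a linking rule concrete, here is a minimal sketch. It assumes that a scene can be summarized as a mapping from participants to coarse observed roles; the names `LINKING_RULE` and `link` and the string labels are hypothetical and introduced only for illustration.

```python
# Hypothetical encoding of the rule "agents are external arguments,
# patients are internal arguments". The labels are illustrative only.
LINKING_RULE = {
    "agent": "external argument",
    "patient": "internal argument",
}


def link(scene):
    """Map each observed participant to a syntactic position.

    `scene` maps a participant to its coarse observed role.
    A participant whose role the rule does not cover maps to None.
    """
    return {participant: LINKING_RULE.get(role)
            for participant, role in scene.items()}


if __name__ == "__main__":
    print(link({"Fido": "agent", "the ball": "patient"}))
```

The lookup is deliberately crude: it consults nothing but the observed role, which is all that a rule of thumb of this kind is supposed to need.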
Now the problem is that it turns out to be hard to come up with
such substantive thematic notions that are also plausibly semantically general.
Another way of saying this is that though there are plausibly some clear cases
of “agenthood,” it is not clear that the same
notion extends usefully to all (or even most) verbal subjects. Thus, though kickers
may be prototypical agents, lovers may not be. At any rate, one of the well-known
problems is that such substantive theta roles have a problem extending to all
predicates.
One important and influential solution to this is Dowty’s
work (here).
It defines super categories of theta roles, collapsing them into two
“proto”-flavors; Proto-Agent (P-A) and Proto-Patient (P-P). Proto roles are
defined over the full semantics of a predicate indexed to a particular argument
position. Thus, P-As are “verbal entailments about the argument in question,”
(i.e. those DPs that have more of the “agent” properties than any other DP in
that argument structure).[3]
On this conception, an argument is P-A if it has a preponderance of the
following properties in a given proposition: it is volitional, sentient, a causer
of events or changes of states in another participant, a mover, exists
independently of the event named by the verb ((27): 572). Indices of P-P are undergoing
a change of state, being an incremental theme, being causally affected by
another participant, being stationary relative to movement of another
participant, not existing independently of the named event ((28): 572). These are
among the contributing factors that
Dowty suggests for classifying arguments into one of the proto categories (he
is quite clear that these may not exhaust the relevant entailments). Note that
on this conception, proto roles are defined in terms of the more articulated
semantics of the sentence. In other words, given
the meaning of a sentence we can compute a coarser grained classification of
arguments into super categories that “average” over the differences. On this
conception, proto-roles “lose” information that the actual meaning of the
sentence contains.
Not surprisingly, for Dowty proto-roles do not determine
semantic interpretations for they presuppose them (i.e. proto-roles are defined
in terms of the entailments of the argument in question in the specific
proposition). Thus, on this view, proto-roles are not important for (1) above.
Their special function (if they are important at all, which is something that
Dowty often questions) is to provide an account of how arguments map to
syntactic positions given that we know the verbal implications of that argument
(i.e. what the proposition means).
IMO, the most interesting version of proto-role theory is
Baker’s UTAH version (see here).[4]
UTAH directly addresses the problem of how to get from pre-linguistic
information into the syntax. The idea is that proto-roles mediate the mapping
from PLD to “D-structure,” (e.g. P-As map to underlying subjects and P-Ps to
underlying objects). Thus, proto-roles are understood to enjoy epistemological
priority and are thus able to mediate a mapping to the linguistic system. What
is less clear is that Baker’s understanding of proto-roles is really the same
as Dowty’s. Why?
Well first, it seems unlikely, at least to me, that LADs
compute proto-roles for a given predicate to see how they map onto the syntax.
This presupposes that LADs have a rather rich understanding of the meaning of
each predicate prior to having any
linguistic analysis of the sentence. Some features of the “scene” may be
evident (e.g. on hearing “Fido is biting the ball” it is evident that Fido is
an “agent” and the ball a “patient”) but it seems to me unlikely that this is a
consequence of a computation over the meaning of “bite” indexed to the subject
and object positions. Rather, here the two notions are simple primitives
applying more or less (im)perfectly to the scene at hand. To get from PLD to G,
this kind of sloppy information may suffice (at least for a sufficient number
of verbs) but it is unlikely to be based on a prior full understanding of the predicates involved. Rather the
opposite. Of course, once the G is engaged, then there is more than theta
theory available to guide the LAD. So, for the linking problem, all that UTAH
must do is get the LAD into the G, then the G can offer other kinds of
linguistic information useful for acquiring the G of interest.
Second, Baker also assumes that the theta roles that solve
the linking problem are also inputs to the semantic interpretation of the
sentence. Note that this is very different from Dowty. For Dowty, proto-roles
are too coarse to provide a semantic interpretation. Baker’s suggestion that
theta roles are critical to meaning (rather than notions derived from the
meaning) assumes a different conception of linguistic meaning than Dowty’s
conception. It is unclear to me whether this conception has been fully
articulated.
There is a second influential view of theta roles, one that
aims to tie it more tightly to a natural semantics. This eschews proto-roles
and develops a more articulated inventory of thematic functions. So, here we get not just two or three roles
but a myriad of these. Agents, causers, experiencers, instruments, goals,
sources, beneficiaries, targets of emotion, etc. This richer conception allows theta structure
to explicate argument structure. Theta roles don’t just reflect meaning. They
determine it. Here, theta roles are cut thinly enough so that they can support
intuitive differences in the meanings of different predicates. Not all
agents/causers/experiencers are the same. We need hyphenated versions of these
to get the full range of mappings that all the different predicates in a
language manifest.
There are two main problems with this conception, I believe.
First, as Dowty argues quite persuasively, we really don’t have an even
approximately decent theory of what these richer roles are or how to specify
them. In particular, there are many, many
verbs where it is quite unclear what the theta roles of the relevant arguments
are, let alone how they differ. The most obvious cases involve symmetrical
predicates like ‘face’ (e.g. “Carnegie Hall faces the Carnegie Deli”) or
‘resemble’ (“Bill resembles Sam”). In such cases it is quite difficult to see
what thematic difference might distinguish one argument from the other. And
this problem generalizes. Why? Because there are many different ways of being
an agent and it is not at all clear that a hugger is an agent in the exact same
way that a lover is. But if these differences are semantically relevant, then
it appears that we will need about as many theta roles as we have predicates.
This is effectively Dowty’s point, but in the other direction. You can’t get from agents directly to huggers
as the concept is intended to abstract away from what makes huggers different
from lovers. But if you want to get all
the way to the actual semantic role that subjects of these particular
predicates play, then you will need a lot of hyphenated theta roles.
Second, it is not clear whether this conception will get you
any purchase on (2). Again, as Dowty notes, for this end we want a coarser
notion, one that will allow us to map arguments to syntactic structure in some general way. Cutting roles too
finely will not yield a simple mapping from roles to structure.
It is worth considering for a minute how these two
conceptions interact with the theta criterion. As Grimshaw, among others, noted
a long time ago, the theta criterion can make do with a very thin conception of
theta role. All it requires is that, whatever
a theta role is, a DP must get one and
no more than one of them. It does not matter how we distinguish roles, only
that we have some way of tying roles to syntactic positions. The prohibition amounts
to the claim that an argument must saturate some position and cannot saturate
more than one. So far as the theta criterion goes, we don’t really need a
general conception of theta role, only of something like “argument position.”
The theta criterion restricts arguments to one and only one of these.
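A small sketch shows how little the criterion, so understood, needs to know. It assumes that argument positions are opaque labels and that a structure can be summarized as a list of (DP, position) pairs; the function name and the labels are hypothetical, and only the DP-side requirement stated above is checked.

```python
from collections import Counter


def satisfies_theta_criterion(dps, saturations):
    """Check that each DP saturates exactly one argument position.

    `dps` is a list of DP labels. `saturations` is a list of
    (dp, position) pairs. Positions are opaque labels; nothing here
    depends on what kind of role a position carries.
    """
    counts = Counter(dp for dp, _ in saturations)
    return all(counts[dp] == 1 for dp in dps)


if __name__ == "__main__":
    ok = [("Bill", "arg1"), ("Sam", "arg2")]
    print(satisfies_theta_criterion(["Bill", "Sam"], ok))
```

Nothing in the check refers to agents, patients, or any other substantive role, which is the point being made here.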
A substantive theory of theta roles, one where the kinds of
theta roles we have matter, then only really arises with the linking problem.
Here we need theta roles that enjoy EP because grammatical notions are not
observables, and so to prime FL, to get us to Gs, we need some notions that can
bridge the G/non-G divide (i.e. some observables that are (at least weakly)
correlated to Gish concepts).
Let me be a little clearer. Subject-hood and object-hood are
not observable except via an FL lens. Agent-hood and patient-hood likely are.
My (ex) dog Sampson could parse many scenes into agents and patients (doers and
done-tos), at least some of the time. If this is so (and I am certain that it
is) then these sorts of notions have EP status (they are not parasitic on FL
for their viability), and these notions can be used to prime FL via something
like UTAH (i.e. agents are subjects, patients are objects). UTAH uses the EP thematic
notions to access FL given some PLD. But,
if this is what one needs thematic notions for, then it is not at all clear
that every argument in every sentence need have a theta role. All that is
required is that enough PLD can be
parsed in this way to get the G system off the ground. Once the LAD has
accessed FL and started developing a G then these Gish notions can take
over/supplement the analysis of the PLD. In other words, theta roles as EPs
need not be very general (i.e. cover every conceivable predicate and argument),
they just need to be general enough to cover enough PLD predicates to prime FL
and get it going. Once FL is engaged then its resources are available for
further linguistic analysis. And for this purpose, these notions can be
(actually should be) quite coarse as their aim is not to provide an
interpretation for the sentence but to just crack open the FL module and make
it usable by the LAD, which, when on-line, is then able to provide (more)
grammatical ways of analyzing the incoming PLD (e.g. this agrees with that so
this is a subject, this is adjacent to the verb so this is the object, etc.).
One might go a step further here, I think. To solve the
linking problem you want coarse roles that are not determined by calculating the verbal inferences of an argument.
Why? Because this is just too fancy a procedure. You want very coarse
indicators, those that Sampson could (and did) use. The problem with
proto-roles as understood by Dowty is that they don’t seem to be EPish. They
are not so much observables as inferables. What I mean is that to get
proto-roles the LAD would need to compute inferences off of pretty
sophisticated semantic representations. And these need not be very accessible.
Better to have limited coarse-grained properties that fit a small number of
available predicates than to have a sophisticated system that generalizes
across all predicates. You just don’t need the latter if what you want to do is
solve the linking problem.
I’ve rambled on long enough and repeated myself way too much
(as if repetition and clarity go hand in hand!). Here is what appears to be the
main conclusion: we seem to have been asking theta roles to do two things that
don’t obviously pull in the same direction. We want them to provide an
interpretation for the sentence and to solve the linking problem. However, the
kind of roles we want for the first appear to be different from the kinds of
roles we need for the second. IMO, the linking problem is the important one for
GG. But if this is right, then having a theory of roles that applies to every DP in every sentence is unnecessary (or at least not obviously required).
We need a few gross observational roles that apply to enough PLD predicates to
get a G up and running. Once engaged, an LAD gets immediate access to a whole
slew of linguistic features that the LAD can effectively use to continue
acquiring its G. On this conception, we just don’t need a general theory of
theta roles (i.e. one that assigns each argument an interpretive role). Which
seems like a good thing given that one does not appear to be currently
available or likely to be forthcoming.
[1]
Everyone (including advocates of the movement theory of control, e.g. me)
assumes that at least the following is accurate: every (contentful (i.e.
non-pleonastic)) DP enters the derivation through a thematic door. Thus the first
relation that any such DP grammatically enters into is a thematic relation.
This is also true of every version of minimalism that I am aware of.
[2]
Even for neo-Davidsonians like Schein and Pietroski where theta roles serve an
important type-lifting role (they are relational predicates that tie a DP to an
event variable), all that is generally required is a distinction between
internal vs external argument. What flavor these are (whether they are agents
or experiencers or causes or…) does not really matter much. The same is even
truer for standard conceptions where arguments are effectively related to their
predicates by saturating a variable position of the predicate via lambda
conversion.
[3]
Arguments actually, for it is not defined syntactically but over the propositional
structure. We can say that a DP has the proto-role in virtue of representing
the relevant argument. I will leave such niceties aside here.
[4]
This is an online version of the paper that appeared in Haegeman’s edited
volume Elements of Grammar. It is a
great paper. One of those that I wish that I had written.