This is the final part of my comments on lecture 3. The first three parts are (here, here and here).
I depart from explication mode in these last comments and turn instead to a
critical evaluation of what I take to be Chomsky’s main line of argument (and
it is NOT empirical). His approach to
labels emerges directly from his conception of the basic operation Merge. How
so? Well, there are only two “places” that MPish approaches can look to in
order to ground linguistic processes, the computational system (CS) or the
interface conditions (Bare Output Conditions (BOC)). Given Chomsky’s conceptually spare
understanding of Merge, it is not surprising that labeling must be understood
as a BOC. I here endorse this logic and conclude that Chomsky’s modus ponens is my modus tollens. If correct, this requires us to rethink the basic
operation. Here’s what I believe we should be aiming for: a conception that
traces the kind of recursion we find in FL to labeling. In other words, labeling
is not a BOC but intrinsic to CS; indeed the very operation that allows for the
construction of SLOs. Thus, just as Merge now (though not in MP’s early days)
includes both phrase building and movement, the basic operation, when properly conceptualized, should
also include labeling.
To motivate you to aim for such a conception, it’s worth
recalling that in early MP it was considered conceptually obvious that Move and
Merge were different kinds of things and that the latter was more basic and
that the former was an “imperfection.” Some (including me and Chris Collins) did not buy this dichotomy, suggesting that whatever process produced phrase structure (now E-Merge) should also suffice to give us movement (now I-merge). In other words, FL, when it arose, came fully equipped with both Merge and Move, neither being more basic than the other. On this view, Move is not an “imperfection” at all.
Chomsky’s later work endorsed this conception. He derived the same conclusion
from other (arguably (though I’m not sure I
would so argue) simpler) premises. I derive one moral from this little history:
what looks to be conceptually obvious is a lot clearer after the fact than ex
ante. Here’s a good place to mention the owl of Minerva, but I will refrain. Thus,
here’s a project: rethink the basic operation in CS so that labels are
intrinsic consequences. I will suggest one way of doing this below, but it is
only a suggestion. What I think there are arguments for is that Chomsky’s way
of including labels in FL is very problematic (this is as close as I come to
saying that it’s wrong!) and
misdiagnoses the relevant issues. The logic is terrific, it just starts from
the wrong place. Here goes.
1. The logic revisited and another perspective on “merge”
There are probably other issues to address if one wants to
pursue Chomsky’s proposal. IMO, right now his basic idea, though suggestive, is
not that well articulated. There are many technical and empirical issues that
need to be ironed out. However, I doubt that this will deter those convinced by
Chomsky’s conceptual arguments. So before ending I want to discuss them. And I
want to make two points: first, I think there is something right about his argument. What I mean is
that if you buy Chomsky’s conception
of Merge, then adding something like a labeling algorithm in CS is conceptually
inelegant, if not worse. In other words, Chomsky is right in thinking that
adding labeling to his conception of
Merge is not a good theoretical move. And second, I want to suggest
that Chomsky’s idea that projection is effectively an interface requirement, a
BOC in older terminology, has things backwards.
The interfaces do not require labeled structures to do what they do. At
least CI doesn’t, so far as I can tell. The syntax needs them. The interfaces
do not. Together, the two points lead to the conclusion that we need to re-think Merge. I will very briefly suggest how we might do this.
Let’s start. First,
Chomsky is making exactly the right kind of argument. As noted at the outset,
Chomsky is right to question labeling as part of CS given his view that Merge
is the minimal syntactic operation. His version of Merge provides unboundedly
many SLOs (plus movement) all by itself. One can add projection (i.e. labeling)
considerations to the rule but this addition will necessarily go beyond the conceptual minimum. Thus,
Merge cannot have a labeling sub-part (as earlier versions of Merge did). In fact, the only theoretical place left for labels is the interface, since the only home for anything in an MP-style account is either CS or an interface BOC. But
as labels cannot be part of CS, they must be traced to properties of the CI/SM
interface. And given Chomsky’s view that the CI interface is really where all
the action is, this means that labeling is primarily required for CI
interpretation. That’s the logic and it
strikes me as a very, very nice argument.
Let me also add, before I pick at some of the premises of
Chomsky’s argument, that lecture 3 once again illustrates what minimalist theorizing
should aim for: the derivation of
deep properties of FL from simple assumptions.
Lecture 3 continues the agenda from lecture 2 by aiming to explain three
prominent effects: successive cyclicity, FSCs and EPP effects. As I have stressed before in other places,
these discovered effects are the glory of GG, and we should evaluate any theoretical proposal by how many of them it explains and how well. Indeed, that’s
what theory does in any scientific discipline. In linguistics theory should explain
the myriad effects we have discovered over the last 60 years of GG research. In
sum, not surprisingly, and even though I am going to disagree with Chomsky’s
proposal, I think that lecture 3 offers an excellent model of what theorists
should be doing.
So which premise don’t I like? I am very unconvinced that labels reflect
BOCs. I do not see why CI, for example, needs labeled structures to interpret
SLOs. What is needed are structured objects (to provide compositional structure), but I don’t see that CI needs labeled SLOs. The primitives in standard accounts of semantic interpretation are things like arguments, predicates, events, propositions, operators, variables, scope, etc. Not agreement phrases, VPs or vPs
or Question Ps etc. Thus, for example,
though we need to identify the Q operator in questions to give the structure a
question “meaning” and we need to determine the scope of this operator
(something like its c-command domain), it is not clear to me that we also need to
identify a question phrase or an agreement phrase. At least in the standard semantic accounts I
am familiar with, be it Heim and Kratzer or Neo-Davidsonian, we don’t really
need to know anything about the labels to interpret SLOs at CI. It’s the
branching that matters, not what labels sit on the nodes.[1]
I know little about SM (I grew up in a philo dept and have
never taken a phonology course (though some of my best friends are
phonologists)), but from what I can gather the same seems true on the SM side.
There are differences in stress between some Ns and Vs, but at the higher levels the relevant units are XPs, not DPs vs. VPs vs. TPs vs. CPs. Indeed, the general
procedure in getting to phrasal phonology involves erasing headedness information. In other words, the phonology does
not seem to care about labels beyond the N vs V level (i.e. the level of
phonological atoms).
If this impression is accurate (and Chomsky asserts, but does not illustrate, that the interfaces care about labeled SLOs), then we can treat Chomsky’s proposal as a reductio: he is right about how
the pieces must fit together given his starting assumptions, but they imply
something clearly false (that labels are necessary for interface legibility); therefore there must be something wrong with Chomsky’s starting point, viz. that
Merge as he understands it is the right basic operation.
I would go further. If labeling is largely irrelevant for
interface interpretation (and so cannot be traced to BOCs) then labeling must
be part of CS and this means that Chomsky’s conception of Merge needs
reconsideration.[2]
So let’s do that.[3]
What follows relies on some work I did (here).
I apologize for the self-referential nature of what follows, but hey it’s the
end of a very long post.
Here’s the idea: the basic CS operation consists of two
parts, only one of which is language specific. The “unbounded” part is the
product of a capacity for producing unboundedly big flat structures that is not peculiarly linguistic or unique to
humans. Call this operation Iteration.
Birds (and mice and bats and whales) do it with songs. Ants do it with path
integration. Iteration allows for the
production of “beads on a string” kinds of structures and there is no limit in
principle to how long/big these structures can be.
The distinctive feature of Iteration is that it couldn’t care less about bracketing. Consider an example: addition can iterate. (((a+b)+c)+d) is the same as (a+(b+c+d)), which is the same as (a+b+c+d), etc. Brackets in iterative structures make no
difference. The same is true in path
integration. What the ant does is add up all the information but the adding up
needs no particular bracketing to succeed. So if the ant goes 2 ft N, then 3 ft W, then 6 ft S, then 4 ft E, it makes no difference to the calculation how these bits of directional information are added together. However you do it, the result is the same. Bracketing does not matter. The same is
true for simple conjunction: (((a&b)&c)&d) is equivalent to (a &
(b & (c&d))) which is the same as (a&b&c&d). Again brackets
don’t matter. Let’s assume then that iterative procedures do not bracket. So there are two basic features of Iteration: (i)
there is no upper bound to the objects it can produce (i.e. there is no upper
bound on the length of the beaded string), and (ii) bracketing is irrelevant,
viz. Iteration does not bracket. It’s just like beads on a string.
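To make the bracketing-insensitivity concrete, here is a minimal sketch in Python (purely illustrative; the path numbers are the ant example from above). Because addition and union are associative and commutative, any grouping of the steps yields the same result.

```python
from functools import reduce

# The ant's path from the text: 2 ft N, 3 ft W, 6 ft S, 4 ft E,
# encoded as (east, north) displacement vectors.
steps = [(0, 2), (-3, 0), (0, -6), (4, 0)]

def add(p, q):
    """Componentwise vector addition, the ant's integration step."""
    return (p[0] + q[0], p[1] + q[1])

# Group (or even reorder) the additions any way you like:
# the net displacement is the same.
assert reduce(add, steps) == reduce(add, list(reversed(steps))) == (1, -4)

# The same point for set union: {a} U {b} U {c} is {a,b,c} however bracketed.
assert ({"a"} | {"b"}) | {"c"} == {"a"} | ({"b"} | {"c"}) == {"a", "b", "c"}
```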
Here’s a little model. Assume that we treat the basic
Iterative operation as the set union operation. And assume that the capacity to
iterate involves being able to map atoms (but only atoms) to their unit sets (e.g. a --> {a}). Let’s call this Select. Select is an operation whose
domain is the lexical atoms and whose range is the unit set of that atom. Then
given a lexicon we can get arbitrarily big sets using U and Select.[4]
For example: If ‘a’, ‘b’ and ‘c’ are atoms, then we can form {a} U {b} U {c} to
give us {a,b,c}. And so forth. Arbitrarily big unstructured sets.
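Here is a minimal sketch of this little model in Python. I use frozensets only because Python’s mutable sets cannot themselves be set members; nothing theoretical hangs on that choice.

```python
def select(atom):
    """Select: map a lexical atom to its unit set, a -> {a}."""
    return frozenset([atom])

def U(x, y):
    """The Iterative operation: ordinary set union."""
    return x | y

# {a} U {b} U {c} -> {a,b,c}: arbitrarily big, but flat ("beads on a string").
flat = U(U(select("a"), select("b")), select("c"))
assert flat == frozenset(["a", "b", "c"])
```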
Clearly, what we have in FL cannot just be Iteration (i.e. U plus Select). After all, we get SLOs. Question: what, if added to Iteration, would yield SLOs? I suggest the capacity to Select the outputs of Iteration. More particularly, let’s assume the little model above. How might we get structured sets? By allowing the output of Iteration to be the input to Select. So, if {a,b} has been formed (viz. {a} U {b} -> {a,b}), Select applies to it (yielding {{a,b}}) and to c (yielding {c}), and out comes the structured SLO {{a,b},c} (viz. {{a,b}} U {c} -> {{a,b},c}). One can also get an analogue of I-merge: Select {{a,b},c} (i.e. {{{a,b},c}}), Select c (i.e. {c}), Union the two sets (i.e. {{{a,b},c}} U {c}), and out comes {c, {{a,b},c}}. So if we can
extend the domain of Select to
include outputs of the union operation, then we can use Iteration to deliver
unboundedly many SLOs.
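Continuing the sketch above, here are the two derivations just described, with Select now permitted to apply to outputs of U (exactly the extension whose license is at issue in the next paragraph):

```python
# E-merge analogue: {a} U {b} -> {a,b}; Select({a,b}) = {{a,b}};
# {{a,b}} U Select(c) -> {{a,b}, c}: a structured, hierarchical object.
ab = U(select("a"), select("b"))          # {a, b}
slo = U(select(ab), select("c"))          # {{a, b}, c}
assert slo == frozenset([ab, "c"])

# I-merge analogue: Select({{a,b},c}) = {{{a,b},c}}; Select(c) = {c};
# their union is {c, {{a,b},c}}, in which 'c' occurs twice -- a "copy."
imerged = U(select(slo), select("c"))     # {c, {{a, b}, c}}
assert imerged == frozenset(["c", slo])
```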
The important question then is what licenses extending the domain of Select to the outputs of
Union? Labeling. Labeling is just the
name we give for closing Iteration in the domain of the lexical atoms.[5] In effect, labels are how we create
equivalence classes of expressions based on the basic atomic inventory. Another
way of saying this is that Labeling maps a “complex” set {a,b} to either a or b,
thereby putting it in the equivalence class of ‘a’ or ‘b’. If Labels allow Select
to apply to anything in the equivalence
class of ‘a’ (and not just to ‘a’ alone), we can derive SLOs via Iteration.[6]
Ok, on this view, what’s the “miracle”? For Chomsky, the
miracle is Merge. On the view above, the miracle is Label, the operation that
closes Iteration in the domain of the lexical atoms. Label effectively maps any
complex set into the equivalence class of one of its members (creating a
modular structure) and then treats these as syntactically indistinguishable
from the elements that head them (as happens in modular arithmetic, where ‘1’ and ‘13’, or ‘12’ and ‘24’, are computationally identical in clock arithmetic).
Effectively the lexicon serves as the modulus with labels mapping complexes of
atoms to single atoms bringing them within the purview of Select.[7]
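As a toy rendering of this idea, continuing the sketch: Label maps a complex set to one of its atomic members, placing it in that atom’s equivalence class, and Select may then treat it as if it were the atom itself. Which member projects is simply stipulated here; a real labeling algorithm would have to derive it.

```python
def label(x):
    """Toy Label: return the atom that 'heads' x, i.e. the representative
    of x's equivalence class. Atoms label themselves; for complex sets we
    stipulate (purely for illustration) that an atomic member projects."""
    if isinstance(x, str):                  # atoms are strings in this toy
        return x
    for member in x:
        if isinstance(member, str):
            return member
    return label(next(iter(x)))             # no atomic member: recurse

# {{a,b}, c} falls in the equivalence class of 'c': for Select it now
# counts as an atom, just as 13 counts as 1 in clock arithmetic.
assert label(slo) == "c"
assert 13 % 12 == 1 % 12
```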
Note that this “story” presupposes that Iteration pre-exists
the capacity to generate SLOs. The U(nion) operation is cognitively general, as is Select, which allows U to form arbitrarily large unstructured objects. Thus, Iteration
is not species specific (which is why
birds, ants and whales can do it). What is species specific is Label, the operation that closes U in the domain of the lexical atoms, and this is what leads to a modular combinatoric system (viz. it allows an operation defined over lexical atoms to also operate over non-atomic structures). Note that if this is right, then labels are intrinsic to CS; without Label there are no SLOs, for without it U, the sole combinatory operation, cannot derive sets embedded within sets (i.e. hierarchy).
The toy account above has other pleasant features. For
example, the operation that combines things is the very general U operation.
There are few conceivably simpler operations.
The products of U necessarily obey the NTC and Inclusiveness, and they yield copies under “I-merge.” Indeed, this proposal treats U as the main combinatoric operation (the operation that constructs sets containing more than one member). And if combination is effectively U, then phrases must be sets (i.e. U is a set-theoretic operation, so the objects it applies to must be sets). And that’s why the products of this combinatoric operation respect the NTC and Inclusiveness and produce “copies.”[8]
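The toy sketch above wears these properties on its sleeve: union never alters its inputs, so combined objects survive intact inside the output (NTC); the output contains nothing beyond its inputs (Inclusiveness); and re-Selecting an already-merged element leaves its original occurrence in place (copies).

```python
# No Tampering: combining slo with something new leaves slo itself intact,
# and nothing beyond the two inputs appears in the output.
bigger = U(select(slo), select("d"))
assert slo in bigger and slo == frozenset([ab, "c"])

# Copies: after the I-merge analogue, 'c' occurs both at the root and
# inside the original structure -- two occurrences of one element.
assert "c" in imerged and "c" in slo
```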
Let’s now get back to the main point: on this
reconstruction, hierarchical recursion is the product of Iterate plus Label. To
be “mergeable” you need a label, for only then are you in the domain of Select and U. So, labels are a big deal and intrinsic
to CS. Moreover, this makes labeling facts CS facts, not BOC facts.
This is not the place to argue that this conception is
superior to Chomsky’s. My only point is that if my reservations above about
treating Labels as BOCs are correct, then we need to find a way of understanding
labels as intrinsic to the syntax, which in turn requires reanalyzing the
minimal basic operation (i.e. rethinking the “miracle”).
IMO, the situation regarding projection is not unlike what
took place when minimalists rethought the Merge/Move distinction central to
early Minimalism (see the “Black Book”).
Movement was taken to be an “imperfection.” Rethinking the basic
operation allowed for the unification of E and I-merge (i.e. Gs with SLOs would
also have displacement). I think we need to do the same thing for Labeling. We
need to find a way to make labels intrinsic features of SLOs, labels being necessary both for building SLOs and for displacing them. Chomsky’s views on projection
don’t do this. They start from the assumption that Labels are BOCs. If this
strikes you as unconvincing as it does me, then we need to rethink the basic
minimal operation.
That’s it. These comments are way too long. But that’s what
happens when you try and think about what Chomsky is up to. Agree or not, it’s
endlessly fascinating.
[1]
Edwin Williams once noted that syntactic categories cross cut semantic ones.
Predicative nominals have the same syntactic structure as argument nominals,
though they differ a lot semantically. I think Edwin’s point is more generally
correct. And if it is, then syntactic labels contribute very little (if
anything) to CI interpretation.
[2]
Though I won’t go into this here, there is plenty of apparent evidence that Gs
care about labeled SLOs. Thus, languages target different categories for movement and deletion. Moreover, there are structure-preservation principles that need explaining: XPs move to MaxP positions, X’s don’t move, and heads target heads. In a non-labeling theory, it is still
unclear why phrases move at all. And the Pied Piping mantra is getting a bit
thin after 20 years. So, not only is
there little evidence that the interfaces care about labels, there is
non-negligible evidence that CS does. If correct, this strengthens the argument
against Chomsky’s approach to projection.
[3]
One more aside: I am always wary of explanations that concentrate on interface
requirements. We know next to nothing about the interfaces, especially CI, so
stories that build on these requirements always seem to me to have a “just so”
character. So, though the logic Chomsky
deploys is fine, the premise he needs about BOCs will have little independent
motivation. This does not make the claims wrong, but it does make the arguments
weak.
[4]
If we distinguish selections from the “lexicon” so that two selections of a are distinguished (a vs a’),
we can get unboundedly big sets. Bags can be substituted for sets if you don’t
like distinguishing different selections of atoms.
[5]
Chomsky flirted with this idea in his earlier discussion of “edge features”
(EF). Ask yourself where EFs came from. They were taken as endemic to lexical
atoms. It is natural to assume that complexes of such atoms inherited EFs from
their atomic parts. Sound familiar? EFs,
Labels? Hmm. The cognoscenti know that
Chomsky abandoned this way of looking at things. This is an attempt to revive
this idea by putting it on what might be a more principled basis.
[7]
Here’s how Wikipedia describes the process:
In mathematics, modular arithmetic is a
system of arithmetic for integers,
where numbers "wrap around" upon reaching a certain value—the modulus.
The modern approach to modular arithmetic was developed by Carl Friedrich Gauss in his book Disquisitiones Arithmeticae, published in 1801.
A familiar use of modular arithmetic is in the 12-hour clock, in which the day is divided
into two 12-hour periods. If the time is 7:00 now, then 8 hours later it will
be 3:00. Usual addition would suggest that the later time should be
7 + 8 = 15, but this is not the answer because clock time
"wraps around" every 12 hours; in 12-hour time, there is no "15
o'clock". Likewise, if the clock starts at 12:00 (noon) and 21 hours
elapse, then the time will be 9:00 the next day, rather than 33:00. Since the
hour number starts over after it reaches 12, this is arithmetic modulo
12. 12 is congruent not only to 12 itself, but also to 0, so the time called
"12:00" could also be called "0:00", since 12 is congruent
to 0 modulo 12.
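For the record, the quoted clock calculations, checked mechanically (a trivial Python sketch):

```python
# Arithmetic modulo 12, matching the quote's examples.
assert (7 + 8) % 12 == 3     # 7:00 plus 8 hours is 3:00
assert (12 + 21) % 12 == 9   # noon plus 21 hours is 9:00 the next day
assert 12 % 12 == 0          # "12:00" and "0:00" name the same hour
```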
[8]
Note that Labels allow one to dispense with Probe/Goal architectures, as heads
are now visible in “Spec-head” configurations. Not that there is anything
“special” about Specs (as opposed to complements or anything else). It’s just
that given labels, XPs can combine with YPs even after “first” merge and still
allow their heads to “see” each other. This, in fact, is what endocentricity
was made to do: put expressions that are not simple heads “next to” each other.
And they will be adjacent whether the elements combined are complements or
specifiers. Chomsky is right that there is nothing “special” about specifiers.
But that’s just as true of complements.
I find this very interesting. I agree with you that it is problematic to say that labels like N and V are needed for the interfaces. They seem so clearly syntactic.
On the phonological side, I suppose most phonologists would agree that the primary candidates for syntactic features that are visible are N and V. Jennifer Smith (at University of North Carolina) has many publications about phonological differences between nouns and verbs in many languages.
However, personally I am not convinced that even those features are really visible. I would think that the difference is merely one of structure. Take the English stress example. It seems to me that we could say that the verb record always has some phonetically empty syllabic ending which is lacking from the noun record. In other words, the relevant structures are /re-cord-∅/ vs. /re-cord/. Stress is always on the penultimate syllable, and phonology does not have to see nouns or verbs. It only has to see that verbs typically have more (morphosyntactic) structure.
The question obviously still remains what ARE N and V? Why does syntax deal with these things rather than something completely different?