In the sciences, it takes a lot of work for a new idea to
take hold. Aside from a modicum of conceptual
clarity, a conceptual innovation must be operationalized,
and this requires offering canonical or paradigmatic models of its application.
I mention this because for me one of the recurring difficulties with the
Minimalist Program has been figuring out what makes any given proposal/analysis
minimalist (i.e. the path from Minimalist Program to Minimalist Theory is often
obscure). So, while being a minimalist
is all very nice (e.g. it greatly (minimally?) enhances my self-esteem), what
is more absent than it should be are clear examples of doing minimalism; examples of what makes a particular
analysis/proposal minimalist or how the abstract leading ideas get concretized
in everyday work. Fortunately, I have recently read some papers that, I
believe, can serve as parade cases of minimalist thinking and provide clear
examples of how the Strong Minimalist Thesis can be incarnated. They are the
topic of today’s sermon.
First, what’s the Strong Minimalist Thesis (SMT)? It is the claim that “language is an optimal
solution” to interface conditions:
…the human faculty of language FL
[is] an optimal solution to minimal design specifications, conditions that must
be satisfied for language to be usable at all…for each language L (a state of
FL), the expressions generated by L must be “legible” to systems that access these
objects at the interface between FL and external systems – external to FL,
internal to the person. (DbP 1).
The systems that L interfaces with use its generated objects. The SMT proposes that the objects of L
are well designed for the cognitive interfaces that use them in doing what they
do. Put another way, the generated objects can be used as is (i.e. without further alteration) to do what needs getting
done (viz. the information they contain need not be further repackaged for the
interfaces to use them for whatever tasks they set their “hands” to.)[1]
What the hell does this mean? Two papers
by Pietroski, Lidz, Halberda and Hunter (PLHH) (here and here) provide a useful
concrete model for interpreting these abstract claims. Their discussion centers
on the correct representation of the meaning of most. Here’s what PLHH do.
The papers are interested in figuring out how ‘most’-sentences affect visual perception
(i.e. how the visual system uses grammatical information in making a visual
judgment). Specifically, how does someone who hears (1) judge whether a
presented array of dots verifies (1)?
(1) Most of the dots are blue
It’s a given that (1) is true iff the number of blue dots exceeds the number of non-blue dots. The
question is how one represents this quantity information and whether the choice of
representation matters to what people do. The problem
becomes interesting in that there are several ways of representing the quantity
information in (1) that are not intensionally equivalent (viz. they use
different predicates and different relations) despite being truth functionally
the same (in Frege speak: they involve different routes to the same truth
value). Here are three possible representations for the meaning of ‘most’.
(2) a. OneToOnePlus[{x: D(x) & Y(x)}, {x: D(x) & ¬Y(x)}], where OneToOnePlus[S1, S2] holds iff some set s, s ⊂ S1, is in one-to-one correspondence with S2
    b. |{x: D(x) & Y(x)}| > |{x: D(x) & ¬Y(x)}|
    c. |{x: D(x) & Y(x)}| > |{x: D(x)}| − |{x: D(x) & Y(x)}|
For D= ‘dot’ and Y= ‘blue,’ the (2a) representation carries out the
evaluation of the dot scene by pairing the blue with the non-blue dots and
seeing if there is at least one extra blue dot left over. The second, in (2b),
sees if the size of the set of blue dots is greater than the size of the set of
non-blue dots and the third, (2c) sees if the size of the set of blue dots is
greater than the size of the set of all the dots minus the set of blue
dots. (2a) differs from the others in
using a distinct predicate (viz. OneToOnePlus), while (2b) and (2c) differ in which
sets are compared: the former directly calculates the set of non-blue dots,
while the latter never directly numerically evaluates this set (i.e. it gets at it only indirectly, by subtracting the number of blue dots from the number of all the dots).
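To make the procedural differences concrete, here is a minimal sketch in Python (my own toy, not anything from PLHH’s papers; representing the dot scene as a list of color strings is an illustrative assumption). Each function implements one of the procedures in (2), and all three agree on the truth value while doing different work to compute it:

```python
# A toy sketch of the three verification procedures in (2), applied to a
# "dot scene" represented as a list of color strings. All three return the
# same truth value for any scene, but compute it in different ways.

def one_to_one_plus(dots):                 # (2a)
    """Pair off blue and non-blue dots one-to-one; true iff at least
    one blue dot remains unpaired."""
    blue = [d for d in dots if d == "blue"]
    non_blue = [d for d in dots if d != "blue"]
    while blue and non_blue:               # remove one of each per step
        blue.pop()
        non_blue.pop()
    return bool(blue)                      # a blue dot is left over

def compare_complements(dots):             # (2b)
    """|blue| > |non-blue|: the non-blue set is enumerated directly."""
    return sum(d == "blue" for d in dots) > sum(d != "blue" for d in dots)

def subtract_from_superset(dots):          # (2c)
    """|blue| > |all| - |blue|: only the blue set and the superset are
    enumerated; the non-blue set is never selected as such."""
    n_blue = sum(d == "blue" for d in dots)
    return n_blue > len(dots) - n_blue

scene = ["blue", "blue", "red", "blue", "yellow"]
assert one_to_one_plus(scene) == compare_complements(scene) == subtract_from_superset(scene)
```

The assert never fails, since the three procedures are truth-conditionally equivalent; the question PLHH press is which procedure the visual-counting system actually prefers.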
PLHH reason as follows. There are two possibilities when
speakers are asked to evaluate dot scenes on hearing (1): (i) the visual/counting system might find
some of these representations more congenial than the others, or (ii) it
might use any of these three truth functionally
equivalent representations, given the right circumstances. If (i) holds, then this visual-counting interface
favors one representation over the other two. Why? Because “linguistic meanings
are related to the cognitive systems that are used to evaluate sentences for truth
and falsity.” More specifically: “a declarative sentence S is semantically
associated with a canonical procedure for determining whether S is true…[and]
competent speakers are biased towards strategies that directly reflect canonical specifications of truth conditions.”
They dub this thesis the Interface
Transparency Thesis (ITT). Put in slightly more “minimalist” terms: a
well designed grammar’s products will supply the information the interfaces
need in a form transparent to those needs. More specifically, the interfaces
will “use” the kinds of information the linguistic structure directly encodes. Or: the information that the grammatical
representations encode and the information that the interface uses are one and
the same.
Before going on, observe that the ITT provides a useful (and,
as PLHH demonstrate, usable)
interpretation of “optimal solution to interface conditions”: for a given
interface (i.e. system that uses language) how transparent is the mapping
between the information made available from L and the information that the
interface uses to do what it does? SMT amounts
to the hypothesis that a strong transparency holds between the information as
coded in L and the information these various interfaces exploit to do what they
do. If considerable transparency holds then
SMT is vindicated. If not, not.[3]
So among other things, one very useful contribution of PLHH’s papers is that
they provide a substantive yet manageable interpretation of the SMT. But that
is not all.
I would not be going through all of this were it not the
case that PLHH demonstrate that not all representations of most are created equal. They
provide very good reasons to conclude that (2c) is the right semantic
representation of most (or, is
clearly superior to (2a,b)). Demonstrating this is conceptually simple but the
argument is quite complicated and rich. Showing (2c) is superior to (2a/b)
requires knowing a lot about properties of the interface, in this case knowing
how people count (humans use two counting systems with different properties)
and how people “count” what they see. Luckily,
this is a well-studied domain of visual perception (Halberda (one of the ‘H’s)
has done a lot of basic work on how humans do this) and so it is possible to
contrive visual dot scenes that would favor one or another of the
representational formats in (2) and see what happens. The answer is that the
information in (2c) is what humans compute, even
when things are visually arranged so that (2a) or (2b) would be simple to apply.[4]
The bottom line: humans have a bias for (2c) and the source of this bias is
reasonably attributed to the fact that the visual-counting system likes the
information as represented in (2c), as per the ITT.
Assume that this is correct. Can we go further and explain the properties of the
representation (2c) in terms of the properties of this interface? Let me be clear: PLHH show that one
representation is preferred to others. We can attribute this to the meaning of most being (2c) coupled with the
ITT. Given this we can ask the next
question: is the representational format of (2c) explicable in terms of the
properties of this interface? Recall, the SMT suggests that FL (a late emerging
system) is the “optimal solution” to interface requirements. This suggests that
the properties of FL are what they are because of the properties of the
interfaces that use its products. PLHH show that
for some features of (2c) this
explanatory chit can be cashed in. Here’s their very interesting argument.
First, note that in (2b,c) different sets are being selected
for enumeration (recall that (2c) says nothing directly about the non-blue dots). This said, it’s a fact that humans are very
good at selecting positively specified features in an array (e.g. blue dots or red dots or
green dots) but not negatively specified ones (e.g. not-blue dots). This
clearly argues against (2b), since it directs one to select the non-blue dots.
Second, it has been shown that subjects (human adults) “always attend and enumerate the superset
of all dots,” which is good news for (2c) as this is a required part of the
specified computation. Third, it can be shown that when subjects use the
Approximate Number System (ANS), the one used in this task, they can “estimate
the cardinality of up to three sets in parallel,” which means that if there are
blue dots, red dots, yellow dots, green dots and mauve dots, then (2b) could not
be used to evaluate (1) in such a scene (i.e. the requirements in (2b) do not
scale up very well, whereas those in (2c) do).
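A back-of-the-envelope sketch may help here (again mine, not PLHH’s model; it simply stipulates the two constraints just cited: at most three sets estimated in parallel, and no directly selectable negatively specified sets):

```python
# A toy rendering of the scaling argument (my gloss, not PLHH's model).
# Stipulated constraints from the text: the ANS estimates at most three
# sets in parallel, and only positively specified sets (a particular
# color, or the automatically attended superset) can be selected.

ANS_PARALLEL_LIMIT = 3

def sets_needed_2b(colors):
    # (2b) compares |blue| with |non-blue|; "non-blue" cannot be selected
    # directly, so every other color set must be enumerated separately.
    return len(colors)          # blue plus each of the remaining colors

def sets_needed_2c(colors):
    # (2c) compares |blue| with |all| - |blue|: just the blue set and the
    # superset, however many colors the scene contains.
    return 2

for colors in (["blue", "red"],
               ["blue", "red", "yellow", "green", "mauve"]):
    for name, needed in (("(2b)", sets_needed_2b(colors)),
                         ("(2c)", sets_needed_2c(colors))):
        verdict = "feasible" if needed <= ANS_PARALLEL_LIMIT else "exceeds ANS limit"
        print(f"{len(colors)} colors, {name}: {needed} sets -> {verdict}")
```

On these assumptions a two-color scene lets either strategy run, which is why the many-color scenes are the diagnostic ones.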
In sum, as PLHH put it:
A meaning like [(2c)]…is
straightforwardly verified with these resources, since the sets required for
verification (one color plus the superset) are easily and automatically
attended by the visual system. Moreover, this meaning does not become less
plausible as the number of color subsets increases.
In other words, given the ITT in this domain (for which PLHH
have provided evidence) and given the properties of the ANS and the visual
system, representations like (2c) perfectly fit the structural capacities of
the interface. Thus, the meaning of most as specified in (2c) fits the noted
interface specifications to a (SM)T!
The work is gorgeous. But aside from its stand-alone value,
it’s really useful for minimalists to contemplate and absorb. The Interface Transparency Thesis provides a
useful concept for investigating how interfaces and grammars “fit.” If such a fit can be established, it is
possible (sometimes) to argue from properties of the interface to properties of
the representations. Minimalists should
understand and absorb this two-step tango, for it serves to operationalize the
SMT, moving it from a frequently annoying slogan to a research problem,
something every minimalist should welcome.
Let me end by noting that PLHH are not alone in deploying
this argument. Berwick and Weinberg (BW) (here, where the notion ‘transparency’
was also mooted and discussed) develop an earlier version of this argument.[5]
Their version of the ITT considers another interface, the parser (i.e. those
interfaces that underlie parsing utterances in real time) and asks what
grammatical properties would allow for optimal parsing (roughly parsing in
linear time). BW showed that parsers with bounded left contexts would serve
nicely and argued that grammars that respected some version of cyclicity+subjacency
would perfectly fit the bill. So, if we assume that parsers use the structures
generated by L to parse, then a cyclicity+subjacency compliant grammar would be a
perfect fit. The form of argument is
exactly the same as that in PLHH, with the relevant interface this time being those
that underlie parsing.
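To give a flavor of the argument (this is my own toy, not BW’s formalism), suppose, subjacency-style, that a filler may be separated from its gap by at most a bounded number of cyclic boundaries (“phases”). A left-to-right parser then never needs more than a bounded chunk of left context, so each token costs constant work and the parse runs in linear time:

```python
# A toy of my own (not BW's formalism): tokens are 'filler', 'gap', or
# 'phase' (a cyclic boundary). If every filler-gap dependency must close
# within BOUND phases, the pending-filler store only ever holds material
# from the last BOUND phases: a bounded left context, hence roughly O(1)
# work per token and linear-time parsing overall.

BOUND = 2   # hypothetical maximum number of phases a dependency may span

def parse(tokens):
    pending = []                       # phases crossed by each open filler
    for tok in tokens:
        if tok == "filler":
            pending.append(0)
        elif tok == "phase":
            pending = [d + 1 for d in pending]
            if any(d > BOUND for d in pending):
                return False           # dependency spans too many phases: reject
        elif tok == "gap":
            if not pending:
                return False           # gap with no filler to discharge
            pending.pop()              # discharge the most recent filler
    return not pending                 # all fillers discharged

print(parse(["filler", "phase", "gap"]))                    # True: within the bound
print(parse(["filler", "phase", "phase", "phase", "gap"]))  # False: exceeds the bound
```

Unbounded dependencies would force the parser to carry an unbounded left context; grammars that bound them are, in this sense, a natural fit for linear-time parsing.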
So, the upshot: there are now some paradigm cases out there
of how to argue for the SMT. Deploying
these arguments requires knowing a lot about grammar and a lot about some interface property. However, as these two
cases show, such arguments can be made. Moreover, they can be made
convincingly. It seems that the SMT is
not merely a guiding regulative ideal but one that can actually be empirically
evaluated. Pretty damn good!
The take home message?
One effective way of investigating the SMT is to identify some interface
system (the parser, the visual system, the ANS) and see how it uses the
grammatical information provided by L.
The SMT leads to the expectation that it uses the information
“transparently,” and that the details of how the interface works can explain
why the representation looks like it does. This is hard to pull off, for it
requires knowing a lot about both the
grammar and the interface at issue. It
suggests that future syntacticians will need to have new skill sets and/or be
very collaborative. This will no doubt
be demanding. But, hey, who ever said that cognitive-biolinguistics would be
easy? The most anyone promised was that it would be fun, and, if these cases
are any indication, crammed with more than a touch of intellectual beauty as
well.
[1]
If I understand the notion “covering grammar” correctly, then one might say
that the competence grammar is the covering grammar for the relevant
interface. In the best case it is the
grammar that every interface uses.
[3]
Note the word ‘considerable.’ The
relevant evaluation will revolve around some estimation of the degree of transparency and this may be a
labile notion. It may be possible to
make these estimations on a case by case basis without having a general measure of transparency.
[4]
As PLHH note: humans can in fact apply the predicates in (2a) and (2b) in non-quantificational
tasks. Thus their failure to apply them in these “linguistic” contexts cannot
be traced to some general human incapacity to deploy them.
[5]
In addition, Colin Phillips’s proposal that the grammar is identical to the parser
can be interpreted as postulating a very strong transparency assumption for
this interface.