The discussion of the SMT posts has gotten more abstract
than I hoped. The aim of the first post discussing the results by Pietroski,
Lidz, Halberda and Hunter was to bring the SMT down to earth a little and
concretize its interpretation in the context of particular linguistic investigations. PLHH investigate the following: there are
many ways to represent the meaning of most,
all of which are truth functionally equivalent. Given this, are the representations empirically equivalent or are
there grounds for choosing one representation over the others? PLHH
propose to get a handle on this by investigating how these representations are used by the ANS+visual system in
evaluating dot scenes wrt statements like most
of the dots are blue. They discover that the ANS+visual system always uses
one of three possible representations to evaluate these scenes even when use of the others would be both
doable and very effective in that context. When one further queries the
core computational predilections of the ANS+visual system it turns out that the
predicates that it computes easily coincide with those that the “correct”
representation makes available. The conclusion is that one of the three
representations is actually superior to the others qua linguistic representation of the meaning of most, i.e. it is the linguistic meaning of most. This all fits rather well with the SMT. Why?
Because the SMT postulates that one way of empirically evaluating candidate
representations is with regard to their fit
with the interfaces (ANS+visual) that use it. In other words, the SMT bids us
look to how grammars fit with interfaces and, as PLHH show, if one understands
‘fit’ to mean ‘be transparent with’ then one meaning trumps the others when we
consider how the candidates interact with the ANS+visual system.
It is important to note that things need not have turned out
this way empirically. It could have been the case that despite core capacities
of the ANS+visual system the evaluation procedure the interface used when evaluating
most sentences was highly context
dependent, i.e. in some cases it used the one-to-one strategy, in others the
‘|dots ∩ blue| > |dots ∩ not-blue|’ strategy and sometimes the ‘|dots ∩ blue| > |dots| - |dots ∩ blue|’ strategy. But, and this is important,
this did not happen. In all cases the
interface exclusively used the third option, the one that fit very snugly with
the basic operations of the ANS+visual system. In other words, the
representation used is the one that the SMT (interpreted as the Interface
Transparency Thesis) implicates. Score one for the SMT.
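To make the three candidate representations concrete, here is a minimal sketch in Python (mine, not PLHH's; the function names and the dot-scene encoding are illustrative assumptions). All three procedures return the same truth value for every scene, which is exactly why truth conditions alone cannot choose among them and behavioral evidence from the ANS+visual interface can:

```python
# Three truth-conditionally equivalent representations of
# "most of the dots are blue". Illustrative sketch only.

def most_one_to_one(dots, blue):
    """One-to-one strategy: pair each non-blue dot with a distinct
    blue dot; true iff blue dots remain unpaired at the end."""
    blues, nonblues = list(blue), list(dots - blue)
    while blues and nonblues:      # pair off one blue with one non-blue
        blues.pop()
        nonblues.pop()
    return bool(blues) and not nonblues

def most_selective(dots, blue):
    """Selective strategy: |dots ∩ blue| > |dots ∩ not-blue|.
    Requires attending the non-blue dots as a set of their own."""
    return len(dots & blue) > len(dots - blue)

def most_subtractive(dots, blue):
    """Subtractive strategy: |dots ∩ blue| > |dots| - |dots ∩ blue|.
    Needs only the total count and the blue count."""
    return len(dots & blue) > len(dots) - len(dots & blue)

# The three agree on every scene, whatever the proportion of blue dots:
scene = set(range(12))
for k in range(13):
    b = set(range(k))
    assert most_one_to_one(scene, b) == most_selective(scene, b) == most_subtractive(scene, b)
```

Note that only the subtractive version gets by with the two quantities the ANS+visual system reliably tracks (the full set and the blue set); the other two need access to the non-blue dots as such.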
Note that the argument puts together various strands: it
relies on specific knowledge on how the ANS+visual system functions. It relies
on specific proposals for the meaning of most
and given these it investigates what happens when we put them together. The kicker is that if we assume that the
relation between the linguistic representation and what the ANS+visual system
uses to evaluate dot scenes is “transparent” then we are able to predict[1]
which of the three candidate representations will in fact be used in a
linguistic+ANS+visual task (i.e. the task of evaluating a dot scene for a given
most sentence[2]).[3]
The upshot: we are able to use information from how the
interface behaves to determine a property of a linguistic representation. Read that again slowly: PLHH argue that
understanding how these tasks are accomplished provides evidence for what the
linguistic meanings are (viz. what the correct representations of the meanings
are). In other words, experiments like this bear on the nature of linguistic
representations and a crucial assumption in tying the whole beautiful package
together is the SMT interpreted along the lines of the ITT.
As I mentioned in the first post on the SMT and Minimalism
(here), this is not the only exemplar of the SMT/ITT in action. Consider one
more, this time concentrating on work by Colin Phillips (here). As previously
noted (here), there are methods for tracking the online activities of parsers.
So, for example, the Filled Gap Effect (FGE) tracks the time course of mapping
a string of words into structured representations. Question: what rules do parsers use in doing
this? The SMT/ITT answer is that parsers use the “competence” grammars that
linguists with their methods investigate. Colin tests this by considering a very complex instance: gaps within
complex subjects. Let’s review the argument.
First some background.
Crain and Fodor (1985) and Stowe (1986) discovered that the online
process of relating a “filler” to its “gap” (e.g. in trying to assign a Wh a
theta role by linking it to its theta assigning predicate) is very eager. Parsers try to shove wayward Whs into
positions even if filled by another DP.
This eagerness shows up behaviorally as slowdowns in reading times when
the parser discovers a DP already homesteading in the thematic position it
wants to shove the un-theta marked DP into. Thus in (1a) (in contrast to (1b)),
there is a clear and measurable slowdown in reading times at Bill because it is a position where the who could have received a theta role.
(1) a. Who did you tell Bill about
    b. Who did you tell about Bill
Thus, given the parser’s eagerness, the FGE becomes a probe
for detecting linguistic structure built online. A natural question is where do
FGEs appear? In other words, do they “respect” conditions that “competence”
grammars code? BTW, all I mean by
‘competence grammars’ are those things that linguists have proposed using their
typical methods (ones that some Platonists seem to consider the only valid
windows into grammatical structure!). The
answer appears to be they do. Colin reviews the literature and I refer you to
his discussion.[4] How do FGEs show that parsers respect
grammatical structure? Well, they seem not
to apply within islands! In other words, parsers do not attempt to relate Whs to gaps within islands. Why? Well given
the SMT/ITT it is because Whs could not have moved from positions within islands
and so they are not potential theta
marking sites for the Whs that the parser is eagerly trying to theta mark. In
other words, given the SMT/ITT we expect parser eagerness (viz. the FGE) to be
sensitive to the structure of grammatical representations, and it seems that it
is.
Observe again, that this is not a logical necessity. There
is no a priori reason why the grammars
that parsers use should have the properties that linguists have postulated,
unless one adopts the SMT/ITT that is. But let’s go on discussing Colin’s paper
for it gets a whole lot more subtle than this. It’s not just gross properties
of grammars that parsers are sensitive to, as we shall presently see.
Colin considers gaps within two kinds of complex subjects.
Both prevent direct extraction of a Wh (2a/3a), however, sentences like (2b)
license parasitic gaps while those like (3b) do not:
(2) a. *What1 did the attempt to repair t1 ultimately damage the car
    b. What1 did the attempt to repair t1 ultimately damage t1
(3) a. *What1 did the reporter that criticized t1 eventually praise the war
    b. *What1 did the reporter that criticized t1 eventually praise t1
So the grammar allows a gap related to the extracted Wh inside the complex subject in (2b)
but not in (3b), and in (2b) only as a parasitic gap. This is a very subtle set of grammatical
facts. What is amazing (in my view
nothing short of unbelievable) is that the parser respects these parasitic gap licensing conditions. Thus, what Colin shows is that we find FGEs at
the italicized expressions in (4a) but not (4b):
(4) a. What1 did the attempt to repair the car ultimately …
    b. What1 did the reporter that criticized the war eventually …
This is a case where the parser is really tightly cleaving to distinctions that the grammar makes. It
seems that the parser codes for the possibility of a parasitic gap while
processing the sentence in real time.
Again, this argues for a very transparent relation between the
“competence” grammar and the parsing grammar, just as the SMT/ITT would
require.
I urge the interested to read Colin’s article in full. What
I want to stress here is that this is another concrete illustration of the
SMT. If
grammatical representations are optimal realizations of interface conditions
then the parser should respect the distinctions that grammatical
representations make. Colin presents evidence that it does, and does so very
subtly. If linguistic representations are used
by interfaces, then we expect to find this kind of correlation. Again, it is
not clear to me why this should be true given certain widely bruited Platonic
conceptions. Unless it is precisely these
representations that are used by the parser, why should the parser respect their
dicta? There is no problem understanding
how this could be true given a standard mentalist conception of grammars. And
given the SMT/ITT we expect it to be true. That we find evidence in its favor
strengthens this package of assumptions.
There are other possible illustrations of the SMT/ITT. We should develop a sense of delight at
finding these kinds of data. As Colin’s stuff shows, the data are very complex
and, in my view, quite surprising, just like PLHH’s stuff. In addition, they
can act as concrete illustrations of how to understand the SMT in terms of
Interface Transparency. An added bonus
is that they stand as a challenge to certain kinds of Platonist conceptions, I
believe. Bluntly: either these
representations are cognitively available or we cannot explain why the
ANS+visual system and the parser act as if they were. If Platonic
representations are cognitively (and neurally, see note 3) available, then they
are not different from what mentalists have taken to be the objects of study
all along. If from a Platonist perspective they are not cognitively (and
neurally) available then Platonists and mentalists are studying different
things and, if so, they are engaged in parallel rather than competing
investigations. In either case, mentalists need take heed of Platonist results
exactly to the degree that they can be reinterpreted mentalistically.
Fortunately, many (all?) of their results can be so interpreted. However, where this is not possible, they would be of absolutely no interest to the project of describing linguistic competence. Just metaphysical curiosities
for the ontologically besotted.
[1]
Recall, as discussed here, ‘predict’ does not mean ‘explain.’
[2]
Remember, absent the sentence and in
specialized circumstances the visual system has no problem using strategies
that call on powers underlying the other two non-exploited strategies. It’s
only when the visual system is combined with the ANS and with the linguistic most sentence probe that we get the
observed results.
[3]
Actually, I overstate things here: we are able to predict some of the properties of the right representation, e.g. that it
doesn’t exploit negatively specified predicates or disjunctions of predicates.
[4]
Actually, there are several kinds of studies reviewed, only some of which
involve FGEs. Colin also notes EEG studies that show P600 effects when one has
a theta-undischarged Wh and one crosses into an island. I won’t make a big deal
out of this, but there is not exactly a dearth of neuro evidence available for
tracking grammatical distinctions. They
are all over the place. What we don’t have are good accounts of how brains implement grammars. We have
tons of evidence that brain responses track grammatical distinctions, i.e. that
brains respond to grammatical structures. This is not very surprising if you
are not a dualist. After all we have endless amounts of behavioral evidence
(viz. acceptability judgments, FGEs, eye movement studies, etc.) and on the
assumption that human behavior supervenes on brain properties it would be
surprising if brains did not distinguish what human subjects distinguish
behaviorally. I mention this only to state the obvious: some kinds of Platonism
should find these kinds of correlations challenging. Why should brains track
grammatical structure if these live in Platonic heavens rather than
brains? Just asking.
I assume there will be a discussion of this soon: http://www.youtube.com/watch?v=iR_NmkkMmO8
I'm not sure if someone has already taken this up with you, but I can't find it elsewhere in the comments. This seems not to be a good case of the SMT, in fact. After discussion with Jeff, it sounds like the reverse: the visual system *can* demonstrably compute all the alternatives, and it therefore must be FL imposing the constraint. This contradictory interpretation is evident in your post even. You first write:
"... They discover that the ANS+visual system always uses one of three possible representations to evaluate these scenes even when use of the others would be both doable and very effective in that context. ...."
[i.e. by the ANS+visual system]
but then write
"... things need not have turned out this way empirically. It could have been the case that despite core capacities of the ANS+visual system the evaluation procedure the interface used when evaluating most sentences was [something different] ..."
[implying that this evaluation procedure is somehow privileged by ANS+V]
- which is it?
The SMT says that optimal ling reps will be perfect matches for the interfaces. If we assume that the predicates and relations optimal in the former are also optimal in the latter we have an instance of the SMT. Ok, what's the optimal representation of the meaning of 'most'? Well there are 3 options PLHH consider. The first is knocked out if it is optimal, either in the interface OR the ling system, to use cardinalities. PLHH argue that the ANS+visual system can use either. However, linguistically, one can argue that cardinalities rule. Why? Because other determiners (e.g. 'exactly 3', 'four more Xs than Ys') seem to require cardinal measures on the assumption that determiners are generalized quantifiers (as Frege suggested). So given that we need cardinalities anyhow and that they suffice to code the meanings of all known determiners, then assuming that all determiners use cardinalities is the "optimal" assumption.
This leaves the selective versus subtractive representations. Here, PLHH argue that the subtractive representation fits with the ANS+visual system most generally, i.e. they argue that the selective system of predicate representations only works in a subset of cases while the subtractive method is fully general. Plus they note that the ANS+visual system always tracks the "full" set of dots, and must always track the blue dots, and that this is always enough for the general case, i.e. regardless of how many non-blue dots there are. Thus, in the general case, the subtractive representation is optimal given restrictions on the ANS+visual system. Conclusion: the actual representation is the optimal one when considered BOTH from the perspective of the linguistic system and the perspective of the interface.
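The generality point can be made concrete with a toy sketch (mine, not from the papers; the color-list encoding of a scene is an illustrative assumption): the subtractive strategy runs in a single pass with just two accumulators, total and blue, no matter how many non-blue colors the scene contains, whereas a selective strategy must build up the non-blue collection color by color:

```python
# Toy illustration: evaluating "most of the dots are blue" over a scene
# given as a list of color labels.

def most_blue_subtractive(scene):
    """Tracks only two quantities: the total count and the blue count.
    Works identically however heterogeneous the non-blue dots are."""
    total = blue = 0
    for color in scene:
        total += 1
        if color == "blue":
            blue += 1
    return blue > total - blue

def most_blue_selective(scene):
    """Must assemble the non-blue dots: one count per non-blue color,
    combined disjunctively at the end."""
    counts = {}
    for color in scene:
        counts[color] = counts.get(color, 0) + 1
    nonblue = sum(n for c, n in counts.items() if c != "blue")
    return counts.get("blue", 0) > nonblue

# A scene with a heterogeneous non-blue set: 3 blue vs. 2 non-blue.
scene = ["blue", "blue", "blue", "red", "yellow"]
```

The subtractive evaluator never has to represent "non-blue" as a predicate at all, which is the sense in which it fits the ANS+visual system's standing habit of tracking the full set and the blue set.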
That's how I see the full argument going. The papers I cited dealt mainly with step 2 (subtractive versus selective representations). The interfaces per se don't choose between the One-to-one rep vs the cardinality reps. But, I think we can argue that here the ling system, wanting to avoid redundancy, will prefer a system that suffices for all linguistically possible determiners and it seems that using cardinalities does indeed suffice. So, Ockham urges cardinalities.
The SMT is about "fit" between FL and interfaces. The fit goes in both directions. You cannot have predicates and operations in either that are not fine in the other. I understand PLHH as presenting an argument in which a perfect (or superior) fit holds with a representation in terms of cardinalities subtractively related.
Hope this makes things clear.
okay, I can see that, it looks like the crucial reasoning is
"the subtractive representation fits with the ANS+visual systems most generally, i.e. they argue that the selective system of predicate representations only works in a subset of cases while the subtractive method is fully general"
- from LHPH (PLHH is the older one):
"[Suppose you put in different colors of nonblue dots and you are asked if 'most of the dots are blue'.] Because the nonblue dots are a heterogeneous set, they cannot be attended directly. Moreover, building up the nonblue dots by constructing a disjunctive combination of all nonblue sets is also not a straightforward visual computation. Listeners simply would not be able to
directly attend the heterogeneous set of nonblue dots."
-- yet nevertheless you COULD do something like this, but it would make a particular behavioral prediction. we don't see that, thus it looks like you DON'T do this. yes, okay, I see the extra layer, the SMT layer, which is an explanation of WHY, to wit [what LHPH just said]. that part of the reasoning is optional, and they sound less committed to it than you, but I would rather be committed.
however, now I wonder about the nature of the explanation: I'm imagining some sort of optimization procedure, over all possible types of data, over all possible interfaces: what would be the best fitting meaning, given the interfaces and the problems they will have to solve. this seems like a nasty optimization problem, and, whether it's solved in the phylogeny or the ontogeny, it seems likely that there are going to be some harder cases than this that we need heuristics for. type 1: different interfaces I'm going to have to talk to express different preferences about the optimal meaning; type 2: different stimuli at the same interface express different preferences about the optimal meaning; and, of course, type 3=type 1 + type 2. can we come up with cases like this?
Yes, the last problem you note is a very nasty one, which is why I take the SMT to be more methodological than metaphysical (I tried to say something about this in another post). However, right now, before we wonder about how or whether we have multiple optimization problems, it would be nice to have a couple of SMT examples on the table. This was why I flagged the PLHH stuff and Colin's stuff and Berwick and Weinberg's stuff. These are examples of how the reasoning might be rooted in some real results. At this moment in time, they don't seem to pull in opposite directions. Last point: I think that you read me as saying that the SMT causes the transparency. I doubt it. Rather the SMT suggests we look for these as a way of probing the structure of mental representations. If such exist, they will tell us something interesting. There is a whole other question of WHY such nice mappings exist. My hunch (and it is even vaguer than a hunch really) is that there has not been enough time for the ling reps to be fitted to the interfaces should they not fit well. Thus, the only ones we see are the ones that fit very well, for the others have no contact with FL at all. If this is coherent, then only the well fitting will be visible at all, as the ones that don't fit simply won't have any visible interface effect.
Hope this helps.