Thursday, June 19, 2014

Comments on lecture 2; first part

This was once a 10 page post. I’ve decided to break it into two to make it more manageable. I welcome discussion as there is little doubt that I got many things wrong. However, it’s been my experience that talking about Chomsky’s stuff with others, even if it begins in the wrong place, ends up being very fruitful. So engage away.

In lecture 2, Chomsky starts getting down to details.  Before reviewing these, however, let me draw attention to one of Chomsky’s standard themes concerning semantics, with which he opens.  He does not really believe that semantics exists (at least as part of FL). Or more accurately, he doubts that there is any part of FL that recursively specifies truth (or satisfaction) conditions on the basis of reference relations that lexical atoms have to objects “in the world.”

Chomsky argues that lexical atoms within natural language (viz. words, more or less) do not refer.  Speakers can use words to refer, but words in natural languages (NL) have no intrinsic reference relation to objects or properties or relations or qualities or whatever favorite in-the-world “things” one cares to name.  Chomsky interestingly contrasts words with animal symbols, which he observes really do look like they fit the classical referential conception as they are tightly linked to external states or current appetites on every occasion of use. As Chomsky has repeatedly stressed, this contrast between our “words” and animal “words” needs explaining, as it appears to be a distinctive (dare I say species specific) feature of NL atoms.

Interestingly (at least to me), the point Chomsky makes here echoes ideas in Wittgenstein’s (W) later writings. Take a look at W’s slab language in the Investigations. This is a “game” in which terms are explicitly referentially anchored. This language has a very primitive tone (a point that W wants to make IMO) and has none of the suppleness characteristic of even the simplest words in NL.  This resonates very clearly with Chomsky’s Aristotelian observations about how words function.

Chomsky pushes these observations further. If he is right about the absence of an intrinsic reference relation between words and the world and that words function in a quasi Aristotelian way, then semantics is just a species of syntax, in that it specifies internal relations between different types of symbols.  Chomsky once again (he does this here for example) urges an analogy with phonological primitives, which also have no relations to real world objects but can be used to create physical effects that others built like us can interpret. So, no semantics, just various kinds of syntax and some pragmatics describing how these different sets of symbols are used by speakers.

Two remarks and we move on to discuss the meat of the lecture: (i) Given Chomsky’s skepticism concerning theories of use, this suggests that there is unlikely to be a “theory” of how linguistic structures are used to “refer,” make assertions, ask questions etc.  We can get informal descriptions that are highly context sensitive, but Chomsky is likely skeptical about getting much more, e.g. a general theory of how sentences are used to assert truths. Interestingly, here too Chomsky echoes W. W noted that there are myriad language games, but he doubted that there could be theories of such games. Why? Because games, W observes, are very loosely related to one another and a game’s rules are often constructed on the fly. 

(ii) With very few exceptions, semanticists, both linguists and philosophers, have not reacted well to these observations. Most of the technology recursively specifies truth conditions based on satisfaction conditions of predicates. There is a whole referentialist metaphysics based on this. If Chomsky is right, then this will all have to be re-interpreted (and parts scrapped). So far as I know, Paul Pietroski (see here) is unique among semanticists in developing interpretive accounts of sentence meaning not based on these primitive referential conceptions.

Ok, let’s now move on to the main event. Chomsky, despite his standard comments noting that Minimalism is a program and not a theory, outlines a theory that, he argues, addresses minimalist concerns.[1] The theory he outlines aims to address Darwin’s Problem (DP). In reviewing the intellectual lay of the land (as described more fully in lecture 1) he observes that FL arose quickly, all at once, in the recent past, and has remained stable ever since.  He concludes from this that the change, whatever it was, was necessarily “simple.” Further, Chomsky specifies the kinds of things that this “simple” change should account for, viz. a system with (at least) the following characteristics:

(i) it generates an infinite number of hierarchically structured objects
(ii) it allows for displacement
(iii) it displays reconstruction effects
(iv) its operations are structure dependent
(v) its operations apply cyclically
(vi) it can have lots of morphology
(vii) in externalization only a single “copy” is pronounced

Chomsky argues that these seven properties are consequences of the simplest conceivable conception of a recursive mechanism. Let’s follow the logic.

Chomsky assumes that whatever emerged had to be “simple.” Why? One reason is that complexity requires time, and if the timeline that experts like Tattersall have provided is more or less correct, then the timeline is very short in evo terms (roughly 50-100k years). So whatever change occurred must have been a simple modification of the previous cognitive system. Another reason for thinking it was simple is that it has been stable since it was first introduced. In particular, human FLs have not changed since humans left Africa and dispersed across the globe about 50-100k years ago. How do we know? Because any human kid acquires any human language in effectively the same way. So, whatever the change was, it was simple.

Next question: what’s “simple” mean? Here Chomsky makes an interesting (dare I say, bold?) move. He equates evolutionary simplicity with conceptual simplicity. So he assumes that what we recognize as conceptually simple corresponds to what our biochemistry takes to be simple. I say that this is “interesting/bold” for I see no obvious reason why it need be true. The change was “simple” at the genetic/chemical level. It was anything but at the cognitive one.  Indeed, that’s the point; a small genetic/biochemical change can have vast phenotypic effects, language being the parade case. However, what Chomsky is assuming, I think, is that the addition of a simple operation to our cognitive inventory will correspond to a simple change at the genetic/developmental level.[2] We return to this assumption towards the end.

As is well known, Chomsky’s candidate for the “simplest” change is the addition of an operation that “takes two things already (my emphasis NH) constructed and forms a new thing from them” (at about 28:20). Note the ‘already.’ The simplest operation, let’s call it by its common name, “Merge,” does not put just any two things together. It puts two constructed things together. We return to this too.

How does it put them together? Again, the simplest operation will leave the combinees unchanged in putting them together (it will obey the No Tampering Condition (NTC)) and the simplest operation will be symmetric (i.e. impose no order on the elements combined).[3]  So the operation will be something like “combine A and B,” not “combine A with B.” The latter is asymmetric and so imposes a kind of order on the combinees.  Merge so conceived can be represented as an operation that creates sets. Sets have both the required properties. Their elements are unordered and putting things into sets (i.e. making things elements of a set) does not thereby change the elements so besetted.[4]
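For the concretely minded, here is a minimal sketch of set-forming Merge in Python. It is only an illustration, not anything Chomsky proposes: the function name `merge`, the use of `frozenset`, and the use of plain strings as lexical atoms are all assumptions made for the example.

```python
def merge(a, b):
    """Toy Merge: form the unordered set {a, b} from two existing objects."""
    return frozenset({a, b})

the_man = merge("the", "man")

# Symmetry: "combine A and B" imposes no order on the combinees.
symmetric = merge("the", "man") == merge("man", "the")

# No Tampering: merging the_man into a larger object leaves it intact,
# so it can still be found, unchanged, as a member of the result.
vp = merge("saw", the_man)
untampered = the_man in vp and the_man == frozenset({"the", "man"})

print(symmetric, untampered)
```

Immutable sets are used here because they can themselves be members of other sets, which is what lets the output of one Merge serve as the input to the next.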

We have heard this song before. However, Chomsky puts a new spin on things here. He notes that the “simplest” application of Merge is one where you pick an expression X that is within another expression Y and combine X and Y.  Thus I(nternal)-Merge is the simplest application/instance of Merge. The cognoscenti will recognize that this is not how Chomsky elaborated things before. In earlier versions, taking two things neither of which was contained in the other and Merging them (viz. E-merge) was taken to be simpler.  Not now, however. Chomsky does not go into why he changes his mind, but he hints that the issue is related to “search.” It is easier to “find” a term within a term than to find two terms in a workspace (especially one that contains a lexicon).[5]  So, the simplest operation is I-merge, E-merge being only slightly more complex, and so also available.
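The I-merge/E-merge distinction can be sketched in the same toy terms. The only difference between the two applications is whether one input is already a term of the other. All names below (`merge`, `terms`, `contains`, `merge_kind`) are hypothetical helpers invented for this illustration, and strings again stand in for lexical atoms.

```python
def merge(a, b):
    """Toy Merge: form the unordered set {a, b}."""
    return frozenset({a, b})

def terms(x):
    """Yield x itself and every term contained anywhere inside x."""
    yield x
    if isinstance(x, frozenset):
        for member in x:
            yield from terms(member)

def contains(y, x):
    """True if x is a proper term of y (x sits somewhere inside y)."""
    return x != y and any(t == x for t in terms(y))

def merge_kind(x, y):
    """Classify an application of Merge by how its two inputs are related."""
    if contains(y, x) or contains(x, y):
        return "internal"   # one input is found inside the other
    return "external"       # neither input is contained in the other

wh = merge("which", "book")
vp = merge("read", wh)

print(merge_kind("the", "man"))  # two separate atoms
print(merge_kind(wh, vp))        # wh is a term of vp
```

On this way of putting it, the "search" hint amounts to the observation that for the internal case the candidate inputs are limited to `terms(y)`, a bounded collection, whereas the external case has to look outside the object altogether.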

Comments: I found this discussion a bit hard to follow. Here’s why. A logical precondition for the application of I-merge is the existence of structured objects and many (most) of these will be products of E-merge. That would seem to suggest that the “simplest” version of the operation is not the conceptually most basic as it logically presupposes that another operation exist.  It is coherent to assume that even if E-merge is more conceptually basic, I-merge is easier to apply (think search). But if one is trucking in conceptual simplicity, it sure looks like E-merge is the more basic notion. After all, one can imagine derivations with E-merges and no I-merges but not the reverse.[6] Clearly we will be hearing more about this in later lectures (or so I assume). Note that this eliminates the possibility of Economy notions like “merge over move” (MoM). This is unlikely to worry Chomsky given the dearth of effects regulated by this MoM economy condition (Existential constructions? Fougetaboutit!).[7] Nonetheless, it is worth noting. Indeed, it looks like Chomsky is heading towards a conception more like “move over merge” or “I over E merge” (aka: Pesetsky’s Earliness principle), but stay tuned.

Chomsky claims that these are the simplest conceivable pair of operations and so we should eschew all else.[8] Some may not like this (e.g. moi) as it purports to eliminate operations like inter-arboreal/sidewards Merge (where one picks a term within one expression and merges it with a term from the lexicon). I am not sure, however, why this should not be allowed. If we grant that finding mergeables in the lexicon is more complex than finding a mergeable within a complex term, then why shouldn’t finding a term within a term (bounded search here) and merging it with a term from the lexicon be harder than I-merge but simpler than E-merge?  After all, for interarboreal merge we need scour the big vast nasty lexicon but once rather than twice, as is the case with many cases of E-merge (e.g. forming {the,man}). At any rate, Chomsky wants none of this, as it goes beyond the conceptually simplest possibilities.

Chomsky also does not yet mention pair merge, though in other places he notes that this operation, though more complex than set merge (note: it does imply an ordering, hence the ‘pair’ in (ordered?) pair merge) is also required.  If this is correct, it would be useful to know how pair merge relates to I and E merge: is it a different operation altogether (that would not be good for DP purposes as we need keep miracles to a minimum) and where does it sit in the conceptual complexity hierarchy of merge operations? Stay tuned.

So, to return to the main theme, the candidate for the small simple change that occurred is the arrival of Merge, an operation that forms new sets of expressions both from already constructed sets of expressions (I-merge) and from lexical items (which are themselves atomic, at least as far as merge is concerned) (E-merge).  The strong minimalist thesis (SMT) is the proposal that these conceptual bare bones suffice to get us many (in the best case, all) of the distinctive properties of NL Gs.  In other words, that the conceptually “simplest” operation (i.e. the one that would have popped into our genomes/developmental repertoires if anything did) suffices to explain the basic properties of FL. Let’s see how merge manages this.

Recall that Merge forms sets in accord with the NTC. Thus, it can form bigger and bigger (with no bound to how big) hierarchically structured objects. The hierarchy is a product of the NTC. The recursion is endemic to Merge. Thus, Merge, the “simplest” recursive operation, suffices to derive (i) above (i.e. the fact that NLs contain an infinite number of hierarchically structured objects).
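A small sketch of the unboundedness point, under the same toy assumptions as before (Merge as `frozenset` formation, strings as atoms; `height` and the atom names `w0`, `w1`, ... are made up for the example): since the output of Merge is a legitimate input to Merge, nothing caps how deep the resulting object can get.

```python
def merge(a, b):
    """Toy Merge: form the unordered set {a, b}."""
    return frozenset({a, b})

def height(x):
    """Levels of set embedding: 0 for an atom, 1 + the deepest member otherwise."""
    if not isinstance(x, frozenset):
        return 0
    return 1 + max(height(member) for member in x)

# Feed each output back in as an input; every step adds a layer of hierarchy.
obj = "a"
for i in range(5):
    obj = merge(f"w{i}", obj)

print(height(obj))  # one level per application of merge
```

The loop bound of 5 is arbitrary; any number of iterations yields a well-formed object with that many levels.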

In addition, I-merge models displacement (an occurrence of the same expression in two different places) and as I-merge is the simplest application of Merge, we expect any system built on Merge to have displacement as an inherent property (modulo AP deletion, see next post).[9]

We also expect to find (iii) reconstruction effects, for Merge plus NTC implies the copy theory of movement. Note that we are forming sets, so when we combine A (contained in B) with B via merge we don’t change A (due to NTC) and so we get another instance of A in its newly merged position.  In effect, movement results in two occurrences of the same expression in the two places.  These copies suffice to support reconstruction effects so the simplest operation explains (iii), at least in part (see note 10).[10]
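The way copies fall out of Merge plus NTC can be made concrete with a toy sketch. As before, `merge` as `frozenset` formation, strings as atoms, and the helper `occurrences` are assumptions made for illustration only, and the little "which book" structure is invented for the example.

```python
def merge(a, b):
    """Toy Merge: form the unordered set {a, b}."""
    return frozenset({a, b})

def occurrences(x, y):
    """Count the positions in y (y itself included) that are occupied by x."""
    count = 1 if x == y else 0
    if isinstance(y, frozenset):
        count += sum(occurrences(x, member) for member in y)
    return count

wh = merge("which", "book")
vp = merge("read", wh)
tp = merge("you", vp)

# I-merge: wh is already a term of tp, and it is merged with tp itself.
moved = merge(wh, tp)

before = occurrences(wh, tp)     # wh only in its original position
after = occurrences(wh, moved)   # original position plus the new one
intact = tp in moved             # NTC: tp, with wh still inside, is unchanged

print(before, after, intact)
```

Nothing in the code "copies" anything. The second occurrence is just what results when an object that is already inside a set becomes a member of a larger set containing that set, which is the sense in which the copy theory comes for free.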

(iv) follows as well. The objects created have no left/right order, as the objects created are sets and sets have no order at all, and so no left/right order.[11] This means that operations on such set theoretic structures cannot exploit left/right order as such relations are not defined for the set theoretic objects that are the objects of syntactic manipulation. Thus, syntactic operations must be structure dependent as they cannot be structure independent.[12]
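A toy sketch of why only structural notions are available. It assumes, as in the rest of these illustrations, that Merge builds `frozenset` objects and that strings stand in for atoms; the helper `depth`, the labels `is_rel` and `is_main`, and the rough structure for "the man who is tall is happy" are all invented for the example.

```python
def merge(a, b):
    """Toy Merge: form the unordered set {a, b}."""
    return frozenset({a, b})

def depth(x, y, level=0):
    """Shallowest level of embedding at which x occurs in y, or None if absent."""
    if x == y:
        return level
    if not isinstance(y, frozenset):
        return None
    found = [depth(x, member, level + 1) for member in y]
    found = [d for d in found if d is not None]
    return min(found) if found else None

# Rough structure: {{the, {man, {who, {is_rel, tall}}}}, {is_main, happy}}
subject = merge("the", merge("man", merge("who", merge("is_rel", "tall"))))
clause = merge(subject, merge("is_main", "happy"))

# "First from the left" is not even statable: sets have no positions.
try:
    clause[0]
    has_positions = True
except TypeError:
    has_positions = False

# "Least embedded" is statable, since depth is defined on the set structure.
closest = min(["is_rel", "is_main"], key=lambda w: depth(w, clause))

print(has_positions, depth("is_rel", clause), depth("is_main", clause), closest)
```

A rule that says "target the linearly first auxiliary" has nothing to refer to in such an object, whereas "target the least embedded auxiliary" does, and it picks out the main clause one.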

This seems like a good place to stop. The discussion continues in the next post where I discuss the last three properties outlined above.

[1] Chomsky would argue, correctly, that should his particular theory fail then this would not impugn the interest of the program.  However, he is also right in thinking that the only way to advance a program is by developing specific theories that embody its main concerns.
[2] I use ‘genetic/developmental’ as shorthand for whatever physical change was responsible for this new cognitive operation. I have no idea what the relation between cognitive primitives and biological primitives is. But, from what I can tell, neither does anyone else. Ah dualism! What a pain!!
[3] We need to distinguish order from a left/right ordering. For example, in earlier proposals, labels were part of Merge. Labels served to order the arguments: {a,{a,b}} is equivalent to the ordered pair <a,b>. However, Merge plus label does not impose a left-right ordering on ‘a’ and ‘b’. Chomsky in this lecture explicitly rejects a label based conception of Merge so he is arguing that the combiners are formed into simple sets, not ordered sets. The issue about ordering, then, is more general than whether Merge, like earlier Phrase Structure rules in GG, imposes a left-right order on the atoms in addition to organizing them into “constituents.”
[4] If it did, we could not identify a set in terms of the elements it contains.
[5] I heard Chomsky analogize this to finding something in your pocket vs finding it on your desk, the former being clearly simpler. This clearly says something about Chomsky’s pockets versus his desks.  But substitute purses or school bags for pockets and the analogy, at least in my case, strains. This said, I like this analogy better than Chomsky’s old refinery analogy in his motivation of numerations.
[6] Indeed, one can imagine an FL that only has an operation like E-merge (no I-merge) but not the converse.  Restricting Merge to E-merge might be conceptually ad hoc, as Chomsky has argued before, but it is doable. A system with I-merge alone (no E-merge at all) is, at least for me, inconceivable.
[7] I know of only three cases where MoM played a role: Existential constructions, adjunct control and the order of shifted objects and base generated subjects. I assume that Chomsky is happy to discount all three, though a word or two why they fail to impress would be worth hearing given the largish role MoM played in earlier proposals. In particular, what is Chomsky’s current account of *there seems a man to be here?
[8] Chomsky talks as if these are two different operations with one being simpler than the other. But I doubt that this is what he means. He very much wants to see structure building and movement as products of the same single operation and that on the simplest story, if you get one you necessarily get the other. This is not what you get if the two merge operations are different, even slightly so. Rather I think we should interpret Chomsky as saying that E/I-Merge are two applications of the same operation with the application of I-merge being simpler than E-merge.
[9] What I mean is that I-merge implies the presence of non-local dependencies. It does not yet imply displacement phenomena if these are understood to mean that an expression appears at AP in a position different from where it is interpreted at CI. For this we need copy deletion as well as I-merge.
[10] Actually, this is not quite right. Reconstruction requires allowing information from lower copies to be retained for binding. This need not have been true. For example, if CI objects were like the objects that AP interprets, the lower copies would be minimized (more or less deleted) and so we would not expect to find reconstruction effects.  So what Merge delivers is a necessary (not a sufficient) condition for reconstruction effects. Further technology is required to actually deliver the goods. I mention this, for any additional technology must find its way into the FL genome and so complicates DP.  It seems that Chomsky here may be claiming a little more for the simplest operation than Merge actually delivers. In Chomsky’s original 1993 paper, Chomsky recognized this. See his discussion of the Preference Principle, wherein minimizing the higher copy is preferred to minimizing the lower one.
[11] As headedness imposes an ordering on the arguments (it effectively delivers ordered pairs), headedness is also excluded as a basic part of the computational system as it does not follow from the conceptually “simplest” possible combination operation. I discuss this a bit more in the next post.
[12] Note that we need one more assumption to really seal the deal, viz. that there are no syntax-like operations that apply after Transfer. Thus, there can be no “PF” operations that move things around. Why not? Because Transfer results in left-right ordered objects. Such kinds of operations were occasionally proposed and it would be worth going back to look at these cases to see what they imply for current assumptions.
