Faculty of Language: Reply To Alex

Thursday, December 6, 2012

Reply To Alex

This is a reply to Alex's reply here. It did not fit into the limited space the comment section makes available. Sorry.

Sigh. Why the bait and switch there, Alex? Where's this talk about categories coming from? But let me get there. It seems to me that you really don't get the argument, so let me illustrate it with an example you will be familiar with, as I gave it to you before.

The question on the table is whether I am entitled to a domain specific UG built with largely domain general "circuits." Now, a priori this seems reasonable. I can build a "will read Windows only" machine using the same chips that will build a "will read OS only" system. The very same chips can be used to exclusively read/use two different programming formats. It's not only doable, it has been done. So the conceptual possibility exists.

In the grammar domain: Say I can show how, using Merge and other simple, non-linguistically-proprietary operations, I can "derive" the binding theory (I try to show this in 'A Theory of Syntax', but differently, and more idiosyncratically, than I sketch here). Here's the proposal:

(i) If A is antecedent to B then A and B form a constituent.

(ii) Merge in both E and I forms is a basic operation.

(iii) Full interpretation holds: A DP must be interpretable at both interfaces, which means it bears both a theta role and a case value.

(iv) There is no DS and so movement into theta positions is ok.

(v) Minimality holds of movement.

(vi) Extension regulates Merge.

The net effect of (i)-(v) is to have reflexivization "live on" A-chains. A-chain properties follow from minimality, Extension, and full interpretation. I take these latter two properties to reflect domain general/computationally general features of FL and so NOT special to FL (this may be wrong, but I argue for it, so let me get away with that here).

Having reflexivization live on A-chains derives binding Principle A. (This is easy to see given the LGB relation between NP-trace and movement via binding theory; I reverse the relation, relating them via movement theory.) The locality follows from (iii) and (v). The c-command condition follows from (vi).

Say, for purposes of discussion, this indeed derives Principle A of the binding theory as I said. Now what does the kid have to learn to master Principle A? Well, all but the fact that 'himself' is spelled out as the tail of the chain is "given." So that's what the kid has to "learn," i.e. that reflexives are spell outs of A-chain tails (roughly the old Lees-Klima account in gussied up form). Note, as I indicated in a reply to an earlier question, this is all the kid has to learn on the GB theory as well (i.e. that reflexives fall under A). The same thing. This is not surprising since, if successful, we have derived Principle A as the product of Merge plus these other principles. In other words, if I reduce Principle A to movement theory, then, if FL is structured as the reducing picture envisages, I am in the same position wrt Plato's problem and Binding Theory as I was in the GB era. The answer to Plato's problem has not changed. The information is domain specific, though the computational circuits used to build the FL circuit board that embodies the competence are largely domain general (i.e. circuits and properties available domain generally) in their properties and modes of operation.

Now, I am not saying that this is correct (though I do like it). I am asking, ASSUMING IT OR SOMETHING LIKE IT CAN BE DONE, whether the fact of a Minimalist reduction means that all learning is domain general. The answer I give is no, once you see that the Minimalist proposal is not a competitor to the GB one but an attempt to place it on more solid foundations. So there's my eaten cake and I plan another big helping.

Now your categorization question: MP (and GB for that matter) had very little to say about words and their categories (i.e. the generalizations adduced were not nearly as impressive as what we had to say about syntax IMHO). Thus, what I said did not address these questions. Truth be told, IMHO we know very little about the intricacies of word learning and the innate knowledge required to get it off the ground. Chomsky's discussion of these matters (riffing on Austin and the later Wittgenstein) is fascinating but so far theoretically inconclusive. So, the short answer is that NOTHING I KNOW ABOUT MP HAS ANYTHING ENLIGHTENING TO SAY ABOUT THIS. I also know that Chomsky believes the same thing. So, as far as I can tell, we have no answer to this question from an MP point of view. However, most arguments for rich UG were made using syntactic facts like those GB and MP do deal with, so the fact that we have no MP story here strikes me as of little relevance.

In sum, what you are pointing out is that there are other important, poorly understood questions. Yup, many. Do these require domain specific innate knowledge? Who knows? I am not being entirely flippant (though I am being a teensy bit). Here's why. MP makes sense because we have theories like GB. Till GB came up with its laws of grammar, the question of how to reduce them to simple principles was way premature. OK, what do we know about word learning and categorization that comes even close to being interesting? Not much. So the Minimalist question is entirely out of place. The thing about research questions is that they make sense in some areas and not in others. They make sense for syntax and so we are making some interesting progress in answering them there. I have no reason to think that they make sense for the problems you mention and so am not surprised that there is not much to say. Of course, should categorization and word acquisition be subject to domain general procedures, I would be delighted. If not, I would start to ask what makes it possible and how much domain specificity we need. But, till we have interesting "laws" here, I will refrain from indulging minimalist confabulations.

24 comments:

Alex Clark, December 7, 2012 at 9:25 AM
Thanks for the expansion. I need to think a bit more about this before I reply properly as I still don't understand the learning proposal properly. Is your recent book a good place to look for the details?

One of the things I am starting to realise is that the range of phenomena that the domain-general learning crowd (people like me) are interested in is almost completely disjoint from the phenomena that you are interested in. We tend to focus on things like lexical category learning, morphology, constituent structure and so on.
Whereas these seem to be taken as given in the descriptions you give of how it works in your theories.
So that partly explains why we often seem to be talking past each other.

My view is that if one can come up with a good story of how one might learn constituent structure in the simple cases where it is undisplaced ('the cat sat on the mat' type examples), then that might go some way towards an explanation of the acquisition of the displaced case ('which mat did the cat sit on?').
So one starts with the easy problems and then moves onto the hard stuff later.
Whereas linguists seem only to be interested in the harder problems -- movement, island effects, the binding theory. This is not a criticism just an observation.
Norbert, December 7, 2012 at 10:41 AM
Could be that we are talking past each other. It is worth recalling that some rather arch rationalists, e.g. Chomsky, had no problem with using general learning systems to lever oneself into the system, e.g. transitional probabilities for word learning, maybe word categorization. I am much more skeptical about basic phrase structure if one wants to not only get 'John saw the lady' but also 'The boy from Montreal that I met last week while playing baseball recruited a lady for cheerleader that he met the day before while she was at the automat.' The problem is not simple categorization but one that allows for the recursive embedding of structure into structure. I have read a few things on learning "phrases" but they never seem to deal with the phrase within phrase within phrase issue and this is the serious one for people like me. Do you know of anything?

As to what linguists care about: there has been a lot of work on two kinds of phenomena: generalizations in "exotic" circumstances (long distance dependencies) and the absence of acceptability in simple circumstances (why can't we say 'John did leave' with unstressed 'do'?). The gaps and the exotica have dominated the field precisely because these look like they are not data driven effects. That's where POS arguments live and were born.

As for my book: always a good idea to look there, buy it for friends (makes a lovely holiday gift), and buy multiple copies for every room in the house. It actually serves as a very good coaster.
AveryAndrews, December 8, 2012 at 4:01 PM
My thought is that people might get along better & the field might have better chances of getting through the probably fast approaching implosion of the university system if the 'nativists' presented themselves fundamentally as purveyors of interesting problems for the empiricists to try to solve some day, omitting the not very useful side remarks about how unlikely they were to be able to solve them.
Unknown, December 9, 2012 at 10:19 AM
Interesting points Norbert. You use the example:
[1] 'The boy from Montreal that I met last week while playing baseball recruited a lady for cheerleader that he met the day before while she was at the automat.'

Can you let me know what the minimalist analysis for this sentence is and also how it differs from, say,

[2] 'The Montreal boy from I that met last week while recruited playing baseball a cheerleader that he met the lady for day at the automat before while she was.'

I would imagine that your model can generate [1] but not [2] - so an explanation of how this is accomplished would be very helpful.
Norbert, December 9, 2012 at 2:24 PM
I have no idea what you are asking. This is a typical derivation of multiple merges plus moves. If you want some more bells and whistles, one can throw in phase based access to the numeration, given the various relative clauses. Are you asking about selection and subcat features? If so, I assume just the derivation of 'man the dog chased the' is enough. The latter is underivable via the combination of subcat info ('the' needs a nominal complement) and linearization conventions as per the LCA, so we get 'the man' not 'man the.' Is this what you mean? So assume standard subcat and LCA and merge plus move meeting feature specifications and these are easy to derive. Am I missing something?
Unknown, December 12, 2012 at 6:03 AM
I wanted the complete unambiguous derivation, not just the 'how to' manual [empiricists claim they have that 'in principle']. But maybe this is not a good format for such a question, so I have a different one for you, which I am sure can be answered here. You use a perfectly grammatical [well formed] sentence above:

[3] I assume just the derivation of 'man the dog chased the' is enough.

But you also tell me that 'man the dog chased the' is underivable. So how do you derive [3]?
Norbert, December 12, 2012 at 2:21 PM
[3]? Do you mean 'the man chased the dog'?
Here's how this is derived:
a. Select 'the' and 'dog' and Merge them: [the dog]. On Merge, check subcat/selection; they match, as 'the' can select/subcat for N, i.e. 'dog.' Label with 'the', i.e. D (I'll indicate this with a * on the label/head): [the* dog].
b. Select 'chased' and Merge it with the output of the previous line, checking selection/subcat features: get [chased [the* dog]]. Label with 'chased', i.e. V: [chased* [the* dog]].
c. As a separate subtree, do for 'the man' what you earlier did for 'the dog': get [the* man].
d. Merge [the* man] with [chased* [the* dog]], checking relevant features. Get [[the* man] [chased** [the* dog]]] (** indicates that 'chased' projected).
That's it, modulo some functional categories, but the same steps of merge and checking hold (maybe an I-merge for case). If you do this, you get:
[[the* man] T* [[the* man] [chased** [the* dog]]]]
Now we need to linearize. If we assume the LCA and multiple spell out (either Uriagereka's version, or Chomsky's with phase based SO and D a phase, or Kayne's with specifiers as adjuncts), we get
'The man chased the dog'.

Does this help? What prevents 'man the dog chased the'? Well, the lower 'the' needs an N, which it doesn't have, or the N has moved for no apparent reason. For the first DP, linearization is screwed up, as 'the' should precede the NP (might need little n here technologically, but Kayne would get this for free). So, a combo of feature checking and LCA gets the right derivation and blocks the bad one. Does this help?
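
The steps above can be sketched in a few lines of Python. This is a toy illustration only, under loud assumptions: the `LEXICON` table, the `Node` class, and the functions `lex`, `merge_head`, `merge_spec`, and `linearize` are all hypothetical names made up for the sketch; T, case, and I-merge are left out; and the LCA is crudely approximated by reading the tree left to right (specifier before head, head before complement).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy lexicon: each word's category and the category it selects.
LEXICON = {
    "the": {"cat": "D", "selects": "N"},
    "man": {"cat": "N", "selects": None},
    "dog": {"cat": "N", "selects": None},
    "chased": {"cat": "V", "selects": "D"},
}


@dataclass(frozen=True)
class Node:
    label: str                      # category of the projecting head
    word: Optional[str] = None      # set only for lexical items
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def lex(word: str) -> Node:
    """Select a lexical item from the toy lexicon."""
    return Node(label=LEXICON[word]["cat"], word=word)


def merge_head(head: Node, comp: Node) -> Node:
    """Merge a lexical head with its complement, checking selection.
    The head projects, i.e. supplies the label of the result."""
    wanted = LEXICON[head.word]["selects"]
    if wanted != comp.label:
        raise ValueError(f"'{head.word}' selects {wanted}, got {comp.label}")
    return Node(label=head.label, left=head, right=comp)


def merge_spec(spec: Node, phrase: Node) -> Node:
    """Merge a specifier with a phrase; the phrase projects.
    Assumption: no feature checking is modeled for this step."""
    return Node(label=phrase.label, left=spec, right=phrase)


def linearize(node: Node) -> list:
    """Stand-in for the LCA: read terminals off the tree left to right."""
    if node.word is not None:
        return [node.word]
    return linearize(node.left) + linearize(node.right)


obj = merge_head(lex("the"), lex("dog"))     # step a: [the* dog]
vp = merge_head(lex("chased"), obj)          # step b: [chased* [the* dog]]
subj = merge_head(lex("the"), lex("man"))    # step c: [the* man]
clause = merge_spec(subj, vp)                # step d: [[the* man] [chased** ...]]
print(" ".join(linearize(clause)))
```

In this sketch, trying to build 'man the' with `merge_head(lex("man"), lex("the"))` raises a `ValueError`, because 'man' selects nothing; that is the selection half of what blocks the bad string, while the fixed left-to-right readout plays the role of the linearization half.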