Wednesday, January 15, 2014

Jeff W comments on comments on recursion

I asked Jeff Watumull to respond to some of the points made concerning our previously flagged paper. He was the real driving force behind our joint effort. Thx Jeff. Here's what Jeff has to say.


On “On Recursion”

Our paper has generated interesting discussion in a previous post.  Here I comment on those comments.

Turing and Gödel:

It is no error to equate Turing computability with Gödel recursiveness.  Gödel was explicit on this point (I am quoting from numerous Gödel papers in his Collected Works; I can furnish references if requested): “A formal system can simply be defined to be any mechanical procedure for producing formulas, called provable formulas[...].  Turing’s work gives an analysis of the concept of ‘mechanical procedure’ (alias ‘algorithm’ or ‘computation procedure’ or ‘finite combinatorial procedure’).  This concept is shown to be equivalent with that of a ‘Turing machine.’”  It was important to Gödel that the notion of formal system be defined so that his incompleteness results could be generalized: “That my [incompleteness] results were valid for all possible formal systems began to be plausible for me[.]  But I was completely convinced only by Turing’s paper.”  This clearly holds for the primitive recursive functions: “[primitive] recursive functions have the important property that, for each given set of values of the arguments, the value of the function can be computed by a finite procedure.”  And even prior to Turing, Gödel saw that “the converse seems to be true if, besides [primitive] recursions [...] recursions of other forms (e.g., with respect to two variables simultaneously) are admitted [i.e., general recursions].”  However, pre-Turing, Gödel thought that “[t]his cannot be proved, since the notion of finite computation is not defined, but it serves as a heuristic principle.”  But Turing proved the true generality of Gödel recursiveness.  As Gödel observed: “The greatest improvement was made possible through the precise definition of the concept of finite procedure, which plays a decisive role in these results [on the nature of formal systems].  There are several different ways of arriving at such a definition, which, however, all lead to exactly the same concept.  The most satisfactory way, in my opinion, is that of reducing the concept of finite procedure to that of a machine with a finite number of parts, as has been done by the British mathematician Turing.”  Elsewhere Gödel wrote: “In consequence of [...] the fact that due to A.M. Turing’s work a precise and unquestionably adequate definition of the general notion of formal system can now be given, a completely general version of Theorems VI and XI [of the incompleteness proofs] is now possible.”
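
To make the contrast between primitive recursion and "recursions of other forms" concrete, here is a minimal Python sketch. It is not taken from Gödel's text; the function names are illustrative. The first function is defined by primitive recursion on a single variable. The second, the Ackermann-Péter function, is the standard textbook example of recursion on two variables simultaneously: it is computable by a finite procedure but is not primitive recursive.

```python
def add(x: int, y: int) -> int:
    """Primitive recursion on y: add(x, 0) = x; add(x, y + 1) = succ(add(x, y))."""
    if y == 0:
        return x
    return add(x, y - 1) + 1


def ackermann(m: int, n: int) -> int:
    """Recursion on two variables simultaneously (Ackermann-Peter function)."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


if __name__ == "__main__":
    print(add(3, 4))        # 7
    print(ackermann(2, 3))  # 9
```

Only small arguments are practical here: the Ackermann-Péter function grows so quickly that Python's default recursion limit is exceeded almost immediately for larger inputs.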

Intension and Extension:

Properly formulated formal systems can be understood as intensionally and extensionally equivalent to Turing machines.  In such systems the axiomatic derivations correspond to the elementary computation steps (e.g., reading/writing); this is as constructive as a Turing machine.  (There exists a machine that directly performs derivations in the formal system rather than encoding the information in binary strings to be manipulated by the machine.)  Accordingly, Gödel did not see formal systems and Turing machines as simply extensionally equivalent: a formal system is as constructive as a proof: “We require that the rules of inference, and the definitions of meaningful formulas and axioms, be constructive; that is, for each rule of inference there shall be a finite procedure for determining whether a given formula B is an immediate consequence (by that rule) of given formulas A1, ..., An[.]  This requirement for the rules and axioms is equivalent to the requirement that it should be possible to build a finite machine, in the precise sense of a ‘Turing machine,’ which will write down all the consequences of the axioms one after the other.”  This equivalence of formal systems with Turing machines established an absoluteness: “It may be shown that a function which is computable in one of the systems Si or even in a system of transfinite type, is already computable in S1.  
Thus, the concept ‘computable’ is in a certain definite sense ‘absolute,’ while practically all other familiar metamathematical concepts depend quite essentially on the system with respect to which they are defined.”  Gödel saw it as “a kind of miracle that”, in this equivalence of computability and recursiveness, “one has for the first time succeeded in giving an absolute definition of an interesting epistemological notion, i.e., one not depending on the formalism chosen.”  Emil Post went further into ontology: The success of proving these equivalences raises Turing-computability/Gödel-recursiveness “not so much to a definition or to an axiom but to a natural law” (Post 1936: 105).  As a natural law, computability/recursiveness applies to any computational system, including a generative grammar.
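
Gödel's image of a finite machine "which will write down all the consequences of the axioms one after the other" can be sketched for a toy system. The system below (one axiom, one single-premise rule) is a hypothetical example chosen for brevity, not one discussed in the paper, and real rules of inference may take several premises. The sketch only shows that a finite specification suffices to enumerate an unbounded set of derivable formulas.

```python
from collections import deque
from itertools import islice
from typing import Callable, Iterable, Iterator


def theorems(axioms: Iterable[str],
             rules: Iterable[Callable[[str], str]]) -> Iterator[str]:
    """Yield every formula derivable from the axioms, one after the other."""
    rules = list(rules)
    seen = set()
    queue = deque(axioms)
    while queue:
        formula = queue.popleft()
        if formula in seen:
            continue
        seen.add(formula)
        yield formula
        # Each rule application is one elementary derivation step.
        for rule in rules:
            queue.append(rule(formula))


# Hypothetical toy system: the axiom "ab" and the rule x => "a" + x + "b".
AXIOMS = ["ab"]
RULES = [lambda s: "a" + s + "b"]

if __name__ == "__main__":
    print(list(islice(theorems(AXIOMS, RULES), 4)))
    # ['ab', 'aabb', 'aaabbb', 'aaaabbbb']
```

The generator never terminates on its own for this system, which is why the usage line takes only a finite prefix with `islice`.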

Rules and Lists:

The important aspect of the recursive-function/lookup-table distinction is not computability per se (table look-up is trivially computable) but explanation.  A recursive function derives--and thus explains--a value.  A look-up table stipulates--and thus does not explain--a value.  (The recursive function establishes epistemological and ontological foundations.)  Turing emphasized this distinction, with characteristic wit, in discussing “Solvable and Unsolvable Problems” (1954).  Imagine a puzzle-game with a finite number of movable squares. “Is there a systematic way of [solving the puzzle?]  It would be quite enough to say: ‘Certainly [b]y making a list of all the positions and working through all the moves, one can divide the positions into classes, such that sliding the squares allows one to get to any position which is in the same class as the one started from.  By looking up which classes the two positions belong to one can tell whether one can get from one to the other or not.’  This is all, of course, perfectly true, but one would hardly find such remarks helpful if they were made in reply to a request for an explanation of how the puzzle should be done.  In fact they are so obvious that under the circumstances one might find them somehow rather insulting.”  Indeed.  A look-up table is arbitrary; it is equivalent to a memorized or genetically preprogrammed list.  This may suffice for, say, nonhuman animal communication, but not natural language.  This is particularly important for an infinite system (such as language), for as Turing explains: “A finite number of answers will deal with a question about a finite number of objects,” such as a finite repertoire of memorized/preprogrammed calls.  But “[w]hen the number is infinite, or in some way not yet completed[...], a list of answers will not suffice.  
Some kind of rule or systematic procedure must be given.”  Gallistel and King (2009: xi) follow Turing’s logic: “a compact procedure is a composition of functions that is guaranteed to generate (rather than retrieve, as in table look-up) the symbol for the value of an n-argument function, for any arguments in the domain of the function.  The distinction between a look-up table and a compact generative procedure is critical for students of the functional architecture of the brain.  One widely entertained functional architecture, the neural network architecture, implements arithmetic and other basic functions by table look-up of nominal symbols rather than by mechanisms that implement compact procedures on compactly encoded symbols.”
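
The look-up-table/compact-procedure contrast can be sketched in a few lines of Python. The table contents and function names are illustrative assumptions, not taken from Turing or from Gallistel and King. The table retrieves a stipulated answer for a finite list of cases; the procedure generates an answer for any pair of non-negative integers.

```python
from typing import Optional

# Look-up table: a finite, stipulated list of answers (hypothetical toy data).
SUM_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 2}


def sum_by_lookup(a: int, b: int) -> Optional[int]:
    """Retrieve a stored answer; returns None for pairs outside the list."""
    return SUM_TABLE.get((a, b))


def sum_by_procedure(a: int, b: int) -> int:
    """Generate the answer for any non-negative pair by repeated successor."""
    result = a
    for _ in range(b):
        result += 1  # one successor step
    return result


if __name__ == "__main__":
    print(sum_by_lookup(1, 1))      # 2
    print(sum_by_lookup(2, 3))      # None: not in the list
    print(sum_by_procedure(2, 3))   # 5
```

Extending the table to cover more cases means adding more entries; the procedure covers them already.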

Iteration and Tail Recursion:

This is mathematics, not computer science.  (Or, rather, I am a mathematician, now interloping in linguistics.  In mathematics, iteration--a general notion applicable to a pattern of succession--is seen as a form of recursion: the function f is defined for an argument x by a previously defined value (e.g., f(y), y < x); but iteration is “tail” recursion given that the previously defined value y is the immediately previously defined value.)  We are on the computational level, not the level of mechanisms.  It is important to recall that Marr and Nishihara (1978) distinguished four--not three--levels: “At the lowest, there is the basic component and circuit analysis--how do transistors (or neurons), diodes (or synapses) work?  The second level is the study of particular mechanisms: adders, multipliers, and memories, these being assemblies made from basic components.  The third level is that of the algorithm, the scheme for a computation; and the top level contains the theory of computation.”  (The theory of computation is mathematical.)  Much of the muddling of iteration and tail recursion in the comments on the previous post is the result of misclassifying the level of analysis.  “[W]e may consider the study of grammar and UG to be at the level of the theory of computation” (Chomsky 1980: 48).  Thus discussion of loops, arrays, etc. is irrelevant.  In fact, algorithms and mechanisms are arguably irrelevant in principle.  We concur with Chomsky that, for the computational system of language “there’s no algorithm for the system itself; it’s kind of a category mistake.  [T]here’s no calculation of knowledge; it’s just a system of knowledge[...].  You don’t ask the question what’s the process defined by Peano’s axioms and the rules of inference, there’s no process” (Chomsky 2013a).  Analogously, a Turing machine is not a description of a process or algorithm or mechanism but “a mathematical characterization of a class of numerical functions” (in the words of Martin Davis (1958: 3), one of the founders of computability theory).  Thus to define the faculty of language as a type of Turing machine as we did in our paper, “On Recursion,” is to give a function: “a finite characterization of an infinite set” (Chomsky 2013b).  A Turing machine--and thus the language faculty--is defined by a tuple containing a finite set of symbols (axioms), a set of states (with “states” defined as “structures” in the sense of mathematical logic), and a transition function (rule of inference) mapping from state/symbol to state/symbol.  “A derivation is thus roughly analogous to a proof with Σ,” a finite set of initial symbols, “taken as the axiom system and F,” the finite set of rewrite rules (or Merge), “[taken] as the rules of inference” (Chomsky 1956: 117), consistent with Gödel’s characterization: “We require that the rules of inference, and the definitions of meaningful formulas and axioms, be constructive; that is, for each rule of inference there shall be a finite procedure for determining whether a given formula B is an immediate consequence (by that rule) of given formulas A1, ..., An[.]  This requirement for the rules and axioms is equivalent to the requirement that it should be possible to build a finite machine, in the precise sense of a ‘Turing machine,’ which will write down all the consequences of the axioms one after the other.”
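
The relation between definition by recursion and iteration described above can be illustrated with the factorial function. This is a sketch under one reading of the text, namely that f(n) is defined from the immediately preceding value f(n - 1); the post does not give a formal definition, and this reading is narrower than the computer-science notion of a tail call. Both functions below characterize the same function in extension; they differ only in how the values are written down.

```python
def factorial_recursive(n: int) -> int:
    """Definition by recursion: f(0) = 1, f(n) = n * f(n - 1)."""
    if n == 0:
        return 1
    return n * factorial_recursive(n - 1)


def factorial_iterative(n: int) -> int:
    """Same values by succession: each step uses only the previous value."""
    value = 1  # f(0)
    for k in range(1, n + 1):
        value = k * value  # f(k) computed from f(k - 1) alone
    return value


if __name__ == "__main__":
    print(factorial_recursive(5))  # 120
    print(factorial_iterative(5))  # 120
```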


  1. Regarding iteration and tail recursion, I think it is rather uncharitable to locate the “muddle” in the minds of the commentators. The term “tail recursion” is one that is standardly used in a computer science context with reference to particular computational mechanisms. (Tail recursion is an interesting special case primarily because it can be implemented in a space-efficient way.) Using the term “tail recursion” in a mathematical context without defining it is bound to lead to confusion. Your response still doesn’t give a definition of tail recursion, so I still don’t know what you mean when you say that a given function is or is not tail recursive. I don’t want to speak for other commentators, but this issue seems to have puzzled a couple of people who have a strong background in mathematics and computer science. (Also, I don’t know which function you’re talking about in the case of Pirahã.)

    1. To add to this: the muddling of levels of analysis rests in specifying something at a computational level and then insisting that the intension matters. Logically the intension can in no way matter at this level, the level of specification of inputs and outputs (naturally apart from what those inputs and outputs are; I'm talking about the intension of the specification itself, which is what you seem to be referring to - although this is itself muddled in the paper). The intension of the grammar itself _can_ matter if you give it something particular to explain, like relating it to parsing complexity (e.g. Stabler, Berwick and Weinberg) or learning (e.g. the evaluation metric program in LSLT, Aspects, etc), i.e., if you actually give it a hook into the algorithmic level. The view here seems to be that the recursion/iteration distinction matters except actually we aren't allowing them any room to matter, and I simply don't see how this is coherent.

    2. To goad a bit: "this is mathematics, not x" strikes me as a rather un-Hornsteinian position to be taking, no?

    3. “Assume that the process is Merge. Take a word and combine it with another word. Then combine the result of that operation with another word. This is neither using the same word over and over (it is using separate tokens) nor recursive -- it is iterative” (Everett 2012: 4). This is misleading, for this process is technically recursive: the value of Merge at step n is defined by the value at step n-1 (i.e., it is a definition by recursion/induction). This is tail recursion (in the mathematical sense) because the value at n is a function only of the value at n-1 (i.e., the “tail” of the derivation; cf. “recursive”/“iterative” implementations of the factorial function).

    4. (N.B. The Pirahã process is in fact not merely tail recursive, because the value of Merge at n is not only a function of the value of n-1, but I assumed it for the sake of argument.)

    5. I'm still not sure exactly what is meant by "tail recursion in the mathematical sense" or whether it applies to functions considered in intension or in extension.

    6. So, am I understanding correctly that according to the terminology used in the paper, the Fibonacci-series could never be defined through a tail-recursive algorithm, because it is defined as Fib(n+2) = Fib(n) + Fib(n+1)? (ignoring base-cases)

      If so, this is a somewhat misleading ("non-standard"?) use of "tail-recursion", as any first-year comp-sci student will tell you.

    7. This is a reply to Jeff Watumull's claims about Everett's [2012] paper [which JW calls misleading]:

      What IS misleading is how the quote has been taken out of context. In the paper Everett quotes a couple of authors (mathematicians) on how one could interpret Merge as recursive or merely iterative (their terms) and he says that for the sake of discussion he considers it recursive. However, he also says that if we so take it, it makes no predictions...

  2. This comment has been removed by the author.

  3. "Thus to define the faculty of language as a type of Turing machine as we did in our paper, “On Recursion,” is to give a function: “a finite characterization of an infinite set” (Chomsky 2013b). "

    - I always thought defining language (or the faculty of language, or whatever) as an X and then drawing conclusions about the thing thus defined (in particular about certain sets thus defined) was something Chomsky(ans) accused their critics of, not something Generative Grammarians were doing. Perhaps the use of "define" here is just somewhat unfortunate, but even then...

    - as Alex C said in the other thread,

    "Essentially everyone in cognitive science from Turing (1950) onwards has assumed that the human mind/brain and indeed animal mind/brains are computable. This is an upper bound on the power of the brain." (source, my emphasis)

    So what's the informative content of "The FLN is (like) a Turing Machine", in particular if we really take care to stick to the purely "computational" level of input-output mapping specification?

    1. Viewing the human (or any) mind/brain as 'computable' strikes me as a questionable although very popular idea because its fundamental job is to catch up with stuff that it wants to eat, etc. and not be caught up with by stuff that wants to eat it; for the first case, termination isn't guaranteed; in the second, not desired, if unavoidable for extrinsic reasons (comparable to the production of a well-formed grammatical sentence being disrupted by some real world event). So some other mode of description seems called for, suitable for processes that in the ideal case can go on forever.

      Perhaps coalgebras? For example, in a language with 'consecutive clause' structures, where an indefinitely long string of 'nonfinal clauses' ends with a 'final clause' with some different morphology, each clause encoding an event in a sequence, it's not 100% obvious that a well-formed utterance must end; in principle you could imagine an indefinitely long lived individual saying something like 'a proton was created, and then another one was created, and then another one was created, ...'. This is analogous to one of the basic coalgebra examples where what is described is a sequence of 'a's that either goes on forever or terminates with a 'b'.

      Bottom line: Turing/Church might be subtly misleading.

    2. Avery, I'm not sure I fully understand your point, but the computability restriction is a pretty weak one, and definitely not at odds with the behavior you describe. What it rules out is that the brain works like an oracle machine, i.e. a Turing Machine with an oracle that it can query to get the answer to specific problems that the Turing Machine cannot solve by itself. One could speculate that evolution has endowed the human brain with an oracle, but I don't quite see how that would produce an oracle that's more powerful than a Turing Machine, nor how this extra power manifests itself in real life.

    3. Actually, thinking some more, there are several people who think/thought that the brain might be more than computable. Gödel for example being one, Roger Penrose being another. I think J.R. Lucas also argued along similar lines. But I don't think these views have much traction nowadays.

    4. The issue is not that syntax might be more than computable, but that recursive computability is actually irrelevant to what the brain does in general, because it's not starting with a finite blob of data and grinding away until it stops with an answer (or not), but managing a continuing stream of events, like an operating system or GUI program. So coalgebras are a math concept that some computer scientists use to study such things; see the tutorial 'A Tutorial on (Co)Algebras and (Co)Induction' and the textbook draft _Introduction to Coalgebra. Towards Mathematics of States and Observations_.

      Syntax is plausibly recursive, and coalgebras are likely irrelevant to it (but not completely obviously so), but discourse might be different.

  4. I wonder at the reviewing process that let this through. I am fairly well versed in this area, and so I took the liberty of commenting on this paper. (I must post this in batches.)

    1. This seems to be another case where stuff on the theoretical side passes a review process in a psychology journal that isn't properly integrated into the linguistics literature. The reviewers listed are excellent in their fields, but their fields aren't really this one.

  5. === BEGIN COMMENTS ===
    - computability ::
    + you define E-language ("departing from Chomsky") as the extension of a function, and then I assume that I-language would be the function qua Turing Machine (or RAM, or lambda term, or Abacus, or Post-system, or partial recursive function term, etc).
    + you claim that "as far as we know, no such distinction applies to [...] non-human animals". In light of the previous definition, it seems like you must be claiming that people working on non-human animals only care about the observed behaviour (extension) and not about any intensional characterization thereof. This seems clearly falsified by experiments on animal pattern learning, where people claim that animals (monkeys, birds, etc) have learned certain patterns (usually finite state, sometimes context-free).
    + The formal bit of this section is a poorly presented version of absolutely basic material present in any textbook on formal language theory.
    - induction ::
    + \lambda-definability coincides of course with the class of partial recursive functions.
    + `embroidered', really?
    + the notion of `primitive recursivity' is different from the notion of `partial recursivity'; if you are not mistakenly confusing the two, I do not understand what the `primitive notion of recursion' is.
    + the characterization of recursive functions you give is woefully incomplete (you do not say how to `recursively define' a function in terms of previous ones). A simpler (and correct) way of giving the notion of partial recursivity is as the closure of the set of constant zero function, projection functions, and successor function under the operations of generalized composition, primitive recursion, and minimization functionals. It is hard for a layman to unpack the correct definition from the glib term `recursively define.'
    + your second paragraph is either false or so misleading as to be garbage
    + it is not true that rewrite rules determine successive TM configurations; typically you will have to make multiple TM transitions to simulate one rewrite step.
    + beginning of fourth paragraph. what is it to `represent explicitly the recursiveness'? The definition of `recursive' that we have at this point is Goedel's (partial) recursive functions (or, maybe) his primitive recursion functional. Ah, but wait, you define this in the next sentence `recursed [=] returned'. And definitions by induction are defined to be definitions by recursion (it seems again that this is not the primitive recursion functional but rather some intuitive notion). In what sense does a function defined by induction (in your sense) strongly generate something? If you have something concrete in mind, you are simply not expressing it clearly.
    + rest of fourth paragraph. I strongly disagree with this point about weak generation (i.e. strings) being irrelevant for linguistics (the most interesting claims about language universals I know of (Joshi's mild context-sensitivity hypothesis) are made in terms of string sets), however this is not relevant to the main thrust of this article.
    + fifth paragraph. by mathematical induction I assume you mean Peano's axiom to the effect that any set that is closed under successor and contains zero contains all naturals. (Which the reader must infer from the bit about the successor function in the Goedel quote.) Your goal in this paragraph seems to be motivating that the parts of natural languages that we observe should be analysed as finite projections of infinite sets. Your strategy is to show that the formalism of transformational grammar allows you to define infinite sets. I cannot make sense of the analogy to the successor function.

  6. - unboundedness ::
    + This first paragraph is confused.
    + Next paragraph, you want to say that a (partial?) recursive function might have an infinite range, and yet, something which we might want to view as implementing that function might fail to behave as that function predicts on more than a finite number of inputs. This is typical fare for cognitive science (in the vein of Pylyshyn and Marr).
    + next paragraph. this is a conjecture, on your parts, that whatever is responsible for the arbitrary limit on outputs is liftable in a principled way. This may be true for certain deviations from the I-language ideal, but not for others (such as garden paths, which sometimes require training to see). Grammars generating infinite sets are postulated in order to account for the regularities of behaviour (special sciences, ceteris paribus, etc), and these arbitrary limits to account for why the predictions of the grammar aren't borne out. The move to a grammar generating an infinite set is justified to the extent that the combination of the grammar plus limits is together more elegant than a finite list. Therefore, this is both an empirical conjecture about possible behaviour and a suggestion about how to account for this postulated behaviour should it actually be manifest. Thanks to people like Paul Smolensky we know how to implement symbolic algorithms in connectionist nets. Furthermore, work on animal pattern recognition commonly assumes that animals are `in fact' learning patterns (like $A^*B^*$) which in actual fact they would certainly not recognize the longer instances of.
    + last paragraph. I can no longer make sense of this in terms of (partial) recursive functions.
    - recapitulation ::
    + here you make something like a concrete proposal.
    - computability ::
    all things that we can give algorithmic descriptions of are computable, from finite sets to infinite sets. clearly, saying that FLN is computable is not saying very much.
    - definition by induction ::
    you have nowhere defined what you mean this to be. do you mean least fixed point computations? this has nothing to do with `strong generation'. Strong generation is not a well-defined term, and has nothing to do with generating strings vs trees; tree grammars weakly generate sets of trees, and trees can be encoded as strings.
    - mathematical induction ::
    I fail to see what role mathematical induction should play here. I understand the principle of mathematical induction as allowing you to conclude from the facts that zero is in a set, and that that set is closed under successor, that that set contains all natural numbers.
    - caps and gaps ::
    + Here you must be using `recursiveness' in a sense different from meaning `(partial) recursive function'. You are also committing the fallacy of assuming that which you would like to show; `second any limitations on depths of embedding in structures that FLN does generate can only be arbitrary' because you have assumed that FLN only generates infinite sets. Actually, you haven't explicitly said this, but you appeal to it anyways. I personally see no reason to make this claim; no grammar formalism that I am aware of only allows you to define infinite sets.
    + You say one relevant thing in paragraph four, by way of arguing against Everett. This is that even if he is correct in his description of the behaviour(al dispositions) of the Piraha, it would be simpler to describe their behaviour as being underlain by an infinite set. This is an empirical claim, but it is the right kind of argument to make.
    + You assert, `it follows from mathematical law that recursion is unlearnable', yet you cite nothing. In diverse learning paradigms (Gold, PAC, MDL,...) infinite classes of infinite languages are learnable.

  7. - evolution ::
    + paragraph four. Okay, some claims.
    - We need to show the `spontaneous' display of:
    1. Computability
    `proof of a procedure [other than] a look up table'
    2. Definition by induction.
    `outputs must be carried forward and returned as inputs'; `outputs [should be] represented hierarchically'
    3. Mathematical induction.
    `generalization beyond the exposure material'
    - I do not know what `spontaneous' is supposed to mean here; what counts, and why should this be accepted?
    - Note that none of these three terms (in these points) is being used in the way you (tried to) define them earlier.
    - points 1 and 3, viewed together as `the animal should display behavioural creativity' (on analogy with linguistic creativity in its standard chomskyian usage) seem relatively uncontroversial
    - point 2 has two parts.
    1. outputs must be carried forward and returned as inputs
    I can see no motivation for this. This is not a description at Marr's computational level, but rather at the algorithmic one. As others have pointed out, this is either trivial (every computation must do this) or unnecessary (an algorithm using recursion can be rewritten into one without).
    2. outputs should be represented hierarchically
    a string is a unary branching tree. if you want some particular kind of hierarchical structure to be a /big deal/ then you should have written a paper on that. Note that many vision researchers treat the process of image recognition as involving hierarchical structure. I think that it might not be easy to have the one but not the other.

    furthermore, your example of path integration is fundamentally flawed. Let us, instead of representing the outputs as a single vector, represent them as a term over vector space operations (an element of the free vector space). This is a hierarchically structured object (it is a tree). There is also a canonical homomorphism from it to the desired vector. In my view, this is /exactly/ what direct compositionality is about. I imagine that you wouldn't want to claim that FLN is not directly compositional... Given that we have direct compositional approaches to minimalism (shameless plug), this is even more unpalatable.
    - universality ::
    + because it is not clear what you mean by `recursion', I cannot be sure I am responding to what you are trying to get at. But if you mean `(partial) recursive function', then you are wrong; it is not a discovery that language is describable as such a thing -- if it is describable at all, then as this. If you only mean `total recursive function', then there is some content there (not much). But then I would invite you to look into the wealth of work on mathematical linguistics for much more developed, sophisticated, and restrictive proposals.
    - super-sentential syntax ::
    + `unless we have truly compelling evidence that it is not $f_{MERGE}$, we should assume on general grounds of parsimony that it is $f_{MERGE}$.' This does not come for free. You have to argue for this.
    === END COMMENTS ===

  8. Do you think for example that nondeterministic Turing machines, Post systems and deterministic Turing machines are all intensionally equivalent?

    If so then I think I don't understand what "intensional" means in this context; a definition would be helpful. How does intensional equivalence differ from extensional equivalence?

  9. @Greg: Thanks for the detailed comments, I might make another attempt at working through the paper with your evaluation next to it.

    @everybody who has ever been part of the recursion paper war: May I ask why this debate has to be so complicated? The only reason there is a discussion at all is that people like Everett object to the notion that recursion is an integral part of language. So all you have to do is look at the available notions of recursion and show that they hold of language:

    1) computable by some recursive function: true given standard assumptions about the brain being at most a Turing Machine
    2) human languages (when viewed as sets of strings) are recursive languages: true, because no known construction pushes language beyond PMCFLs
    3) recursion as self-embedding: true in various languages even when analyzed in a non-generativist framework; the fact that not all languages may allow for self-embedding is about as interesting as the fact that context-free grammars, which can handle center embedding, can generate regular languages. Just because FLN has a certain amount of power does not mean that this power is instantiated in every language.

    As far as I can tell, that pretty much covers all positions that one could try to falsify empirically, and they all hold.

    Now it is true that if one wants to make precise the original claim that Merge is an essential property of language, one enters more slippery terrain because that requires a precise definition of Merge, some assumptions about what the explanandum is, and a proof that you couldn't have carved things up differently. That's pretty much what Greg points out in his last comment. But that's the second level in the debate (World 2: Merge), and I don't understand why after so many years we are still stuck in the first one (World 1: Recursion).

    1. Point (3), interestingly enough, is basically what Norbert said previously on this blog:

    2. "May I ask why this debate has to be so complicated? The only reason there is a discussion at all is that people like Everett object to the notion that recursion is an integral part of language"

      Part of the 'problem' might be the disrespect displayed in phrases like 'people like Everett'. But the deeper problem lies in the habitual imprecision of work by Chomsky [and some of those who take his work as starting point for their own]. Avery pointed out earlier for how long ambiguities have plagued the debates. And then there is the tiny issue raised by Katz and Postal decades ago: whether human brains can generate languages that have unbounded recursion. So unless you can address this challenge you can not take FL for granted. I understood part of the aim of WHRH to be addressing the Katz&Postal challenge [even though these names are never mentioned - another problem plaguing Chomskyan style debate]. I do not think they succeeded but their trying certainly indicates that the debate has been 'complicated' long before Everett [2005]...

    3. Part of the 'problem' might be the disrespect displayed in phrases like 'people like Everett'
      I don't see what's disrespectful about the 'people like X' construction. It's meant to be read similarly to "Even people like you and me could be negatively affected by this change in policy".

      whether human brains can generate languages that have unbounded recursion.
      Maybe I'm just nitpicking your choice of words here, but this has never been the issue. The issue is what the specification of the language module (vulgo grammar) is, not what it can actually generate under the cognitive constraints imposed by the brain. A quick simile: If we cared about the producible output, every program you run on your computer would be finite-state. But if you actually look at the source code, you'll see that hardly any of them are specified this way. And there are good reasons why they are not.
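The simile can be sketched in a few lines. This is an illustration under stated assumptions, not a claim about any real parser: `parse_s`, `accepts`, and the `limit` parameter are hypothetical names, and `limit` stands in for whatever resource bound the hardware imposes. The specification is a recursive rule for balanced brackets; the bound only restricts which strings are actually processed.

```python
def parse_s(s, i, depth, limit):
    """Recursive rule S -> "(" S ")" S | empty.

    Returns the index just past the longest S starting at position i.
    `limit` models an external resource bound; it is not part of the rule.
    """
    if i < len(s) and s[i] == "(":
        if limit is not None and depth >= limit:
            raise MemoryError("nesting exceeds the resource limit")
        j = parse_s(s, i + 1, depth + 1, limit)
        if j >= len(s) or s[j] != ")":
            raise ValueError("unbalanced brackets")
        return parse_s(s, j + 1, depth, limit)
    return i

def accepts(s, limit=None):
    """True if the whole string is parsed as an S within the resource limit."""
    try:
        return parse_s(s, 0, 0, limit) == len(s)
    except (ValueError, MemoryError):
        return False
```

With `limit=2` the set of accepted strings is finite-state, yet the source is still written as a recursive rule, and raising the limit changes nothing in the specification.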

      So unless you can address this challenge you can not take FL for granted.
      I think it's irrelevant for the discussion of recursion whether FL is taken for granted. Even if you view it as just a convenient shorthand for some domain-general procedure the claim that recursion is an essential part of it does not suddenly turn into gibberish and is still valid under any of the three interpretations above. Now of course CHF made the stronger assumption that FLN is an independent module, but that is an independent issue.

      Anyways, my basic point was that the whole recursion debate has always seemed like a non-issue to me. The claim (viewed independently of the additional cognitive and evolutionary assumptions of CHF) is rather weak no matter how it is interpreted, and the empirical counterarguments --- while certainly interesting on a descriptive level --- are perfectly compatible with it under every interpretation. So I just don't understand why this has been such a hot button issue for over 10 years now, for both sides of the debate.

    4. @Alex: yes, point 3 is not some ingenious insight of mine, it's pretty much the standard notion of universals in the Chomskyan sense. Which is why I find it so puzzling that we didn't just politely point that out after the first Everett paper and go back to business as usual.

      I would be less surprised if people objected to the idea that FLN is only recursion, that is quite a strong claim --- even if we equate recursion with Merge, and subsume Move under Merge, that still leaves a lot of stuff that needs to be done away with or derived from FLN-external factors. But instead the debate is about whether recursion is at all part of FLN, for reasons I cannot fathom.

    5. @Thomas: Completely agree. From a sociological point of view, I think part of the problem may be the strong temptation to respond to Everett's argument by showing that his factual claims about Pirahã syntax are wrong. This is especially tempting given that some of these claims seem rather incredible. However, as you say, this has no bearing on whether recursion is part of FLN.

    6. @Christina: Just to expand on some of what Thomas said. I think the question of whether there is a cognitive module devoted to language is irrelevant to the (daily life of the) working minimalist syntactician. As far as I can tell, nothing that linguists *actually* do would change if we discovered tomorrow that the linguistic system just *is* the vision system. Except of course, that there would suddenly be a whole lot of interesting new work on how to reduce the linguistic concepts to the ones from vision, or vice versa, in just the same way that there are a lot of interesting questions on how to deal with phonetics in the manual modality (sign language). But the papers in LI, NLLT, or the like would be the same. In other words: this is a point of rhetoric that plays no role in linguistics as she is done. It is, of course, still an interesting question.

      As for Katz&Postal, I think that the real issue is `what is a grammar', and `what is its relation to the parser/generator'. I think that there are two ideas about this latter floating about (if I were Fodor I would dub them the RIGHT and the WRONG views). The first is that the grammar is simply an abstract description of regularities in the behaviour of the parser (this is the levels interpretation from Pylyshyn, Marr, Poggio, etc). According to this view, there are not two ontologically distinct things, the grammar and the parser, but just one, described at different degrees of abstraction. The other view is that the grammar functions as some sort of knowledge base, which the parser is able to query. Here, there are two ontologically distinct entities. Devitt brutally savages this latter view, and is at least less ill disposed to the former. My problem with the second is that it is just not clear anymore what explanatory role a grammar serves.

    7. @Greg, thank you for the comments. I take your word for it that "the question of whether there is a cognitive module devoted to language is irrelevant to the (daily life of the) working minimalist syntactician". To be perfectly honest, I have suspected this for some time in spite of the ongoing proclamations by Chomsky [and the host of this blog] that minimalists attempt to uncover the BIOLOGY of language. As [former] biologist I just never recognized anything resembling the work of a biologist in what minimalists do.

      However, the purpose of Norbert's blog is to educate about the true aims of minimalists. And I fear he'll disagree with you that "nothing ... would change if we discovered tomorrow that the linguistic system just *is* the vision system." Here is a nice quote from his first blog:
      "This blog ... will partly be a labor of hate; aimed squarely at the myriad distortions and misunderstandings about the generative enterprise initiated by Chomsky in the mid 1950s. There is a common view, expressed in the Chronicle article, that Chomsky’s basic views about the nature of Universal Grammar are hard to pin down and that he is evasive (and maybe slightly dishonest) when asked to specify what he means by Universal Grammar (henceforth I’ll stick to the shorter ‘UG’ for ‘Universal Grammar’). This is poodle poop!

      The basic idea is simple and has not changed: Just as fish are built to swim and birds to fly humans are built to talk. Call the faculty responsible for this ability ‘the Faculty of Language,’ (FL for short). The aim of the generative enterprise is to describe the fine structure of FL."

      If nothing would be at stake for syntacticians if Chomsky's BIOlinguistic view turned out to be wrong I doubt anyone would engage in a labour of hate to defend that view. And IF one takes biolinguistics even in the slightest seriously, then the hardware/implementation does matter, then the task of the biolinguist would be to discover what is already there [not speculate confusedly about what different types of computations might or might not generate]. As you point out in your multipart earlier post: WHRH is teeming with confusion and conflation re recursion. On top of that it is lacking any meaningful reference to biology...

      re Katz/Postal just one very brief comment: I am here only interested in their criticism of the internally incoherent ontology of Chomsky's biolinguistics. This criticism is entirely independent of them being right about their own ontology.

    8. I'm a syntactician (tho not a minimalist, atm at least) and agree 100% with Greg's position (and find Norbert's views to be interesting and in some areas even plausible conclusions, but completely unnecessary and therefore wrong as assumptions).

    9. @Avery: in case this was not clear; I do NOT disagree with Greg's position. Au contraire, I think he is absolutely right re the irrelevance of 'biological foundations' for the actual work of the syntactician. I believe this is even the case for those who [unlike you] claim to work on the biology of language. And I am not alone with this belief: "[Chomsky's] ontology is evidently so awful that even he pays no attention to it when actually considering real linguistic matters." [Postal, 2009, 257] - for details see:

      So I was merely reminding Greg that the aim of this blog is at odds with what syntacticians actually do. Further, it seems that Norbert is incorrect to claim that "The aim of the generative enterprise is to describe the fine structure of FL" - if FL is a biological organ. From what I can tell generativists do not even attempt to describe any fine biological structures [and the miracle invocation in the 'evolutionary account' is awfully close to ID claims]. If FL is NOT a biological organ but a 'type of abstract Turing machine' then it would seem much of the hostility towards 'people like Everett or Levinson' is pointless. Maybe the botched WHRH paper would be a good starting point for re-thinking WHAT the aim of the generative enterprise is and, if [as you and Greg say] biology does not play any role in the work of generativists - then all this misleading talk about biology ought to be eliminated.

    10. @Christina: I think that the entire talk of language as a domain specific or domain general phenomenon, which has been advocated for by generative linguists from the get-go, is a red herring in terms of actual practice. Therefore, it is not this blog, nor minimalism, from which this emanates.
      However, I would like to underscore my fundamental agreement with what I take to be Chomsky's identification of the three main empirical phenomena of linguistic interest:
      1) "Descriptive adequacy" (my trans.: accounting for linguistic behaviour (usual caveats for special sciences, ceteris paribus, dispositions, etc))
      2) "Explanatory adequacy" (my trans.: accounting for language acquisition i.e. the fact that our linguistic behavioural dispositions are learned)
      3) "Beyond explanatory adequacy" (my trans.: accounting for the fact that humans are able to do this stuff while phylogenetic neighbors cannot)

      My main point is just that linguists are working on point 1, and that the disputes about whether point 2 should be dealt with by domain general or domain specific mechanisms are therefore largely irrelevant to this work which is actually taking place.

      About `biological organs'. What does this even mean? I know of its use in Chomsky's work with respect to language acquisition, likening the development of language to the development of kidneys (i.e. not inductive inference). If this is what is meant, I think it is not the most interesting hypothesis (the most interesting hypothesis being that language acquisition is a form of inductive inference, which Alex C et al are working on).
      But clearly, if one adopts some form of materialism, our descriptions of language (behaviour(al dispositions)) must have some relation to the brain. The classical cognitive science position (e.g. Marr) is that cognitive faculties must be described at three levels, the computational level, the algorithmic level, and the implementational level. (Peacocke would add a level between computational and algorithmic.) Therefore, it is not incorrect to assert that "the aim of the generative enterprise is to describe the fine structure of FL." It's just that we are concentrating our work on the most abstract level, first. (As suggested by Marr.)

    11. Thank you for the comments, Greg. You are certainly right: this blog did not invent talk about language as domain specific phenomenon. But it certainly defends this view [unless Norbert had a pretty dramatic change of mind]. Now it would seem that your "1" focusses on performance while I understand Chomsky [and Norbert] to say we ought to focus on competence. But, again you're right the actual work done is not work on brains but work on what comes out of people's mouths [or is signalled in sign language]. One may of course be able to draw SOME conclusions about innate structures that might generate the linguistic behaviour. But I think the current state of the art hardly justifies the term 'fine structure of FL' as object of research.

      You say "About `biological organs'. What does this even mean." - that is an excellent point: after allegedly doing 60+ years of research on the BIOLOGY of language [as claimed by Chomsky, 2012, p. 21] it is rather odd that we know virtually nothing about this postulated biological organ. So whether or not there is the analogy to Marr you suggest is an open question. If someone like Everett got things right then language is not the kind of monolithic cognitive faculty to which the Marr levels apply.

      Now if one takes the view Postal does [that languages are abstract objects] one can afford to work just on the abstract level and leave it to psychologists to figure out what is going on when humans come to know these abstract objects. But if one claims to be a biolinguist one cannot ignore the implementational level. One may of course have some division of labour - so that people like yourself focus on the computational level. But to have NO ONE working on the implementational level is quite unusual for the biological sciences...

      One should not forget that Marr's work [which eventually led to the model of visual processing] was based on anatomical and physiological data. Once the neural correlates were established it became possible to 'abstract away' from them to reach a deeper level of understanding. But as far as I know Chomskyan biolinguists have never established any neural I-language structures - so there is nothing to 'abstract away from'. Chomsky himself warns against radical abstracting:
      "[Connectionists are] abstracting radically from the physical reality, and who knows if the abstractions are going in the right direction?" [Chomsky, 2012, 67] - INDEED, how do we know the biolinguistic abstractions are going in the right direction?

    12. @Christina: My 1-3 were intended as explicanda; they are the objects of study. The competence-performance distinction is a hypothesis to the effect that the best (or at least a good) way to account for the data is by postulating an abstract rule-governed domain, which underlies the actual behaviour. What could linguistics possibly be about if not behaviour (broadly construed)? Nothing else is observable.

      It is of course a hypothesis that this approach will be successful. (Marr himself (1977) expresses doubts that language has a type 1 explanation.) Like all things, only time will tell. I think it suggestive that we have been able to describe so much using this approach. I am happy with Everett pursuing his ideas. How interesting would it be if he were right! I will happily change my mind about the relative promise of these two approaches if presented with reasons to. I think the Chomsky et al paper on recursion is of low quality, the Hornstein et al one even worse, and I find it terrible that this kind of crap is the public face of linguistics. Like Thomas has said, this is a non-issue; Everett's conclusions pose absolutely no `threat' to the ideas of generative grammar. He is to be forgiven, of course, for taking Chomsky at his word.

      I do not understand Postal's view. (Or Katz'.) Is this a form of dualism? If Postal agrees that he is ultimately engaged in task 1, then I am happy with him talking about it in any way he would like. If he thinks that he is not ultimately describing behaviour(al dispositions), what is his object of study? If there is no data that he is responsible for, then what is he doing? There is in fact interesting theoretical work on linking competence type theories to rich behavioural data (including neural data). I'm thinking in particular of John Hale's Surprisal and Entropy Reduction, Smolensky's tensor products, and beim Graben's dynamic cognitive modeling.

      You will not get very far towards understanding a computer by looking at the sequence of states (registers plus stack) it is passing through. It is possible that the brain is different (in that it is easier to understand what is happening by looking at anatomy). Certainly linguists seem to understand language (in the sense of why sentences mean what they do) better than do psychologists/neurologists. Marr in his book advocates proceeding top-down. Whether this is the benefit of hindsight, or poor memory, I cannot say.

      As for biolinguistics, well, I don't know just what you mean by this. Since Chomsky introduced the minimalist program, and task 3, many have jumped on the bandwagon. I think task 3 is a valid and important endeavour. I have no idea how to begin addressing it. (Chomsky himself warns that it may be too early to begin doing so.) What do you think of as the `biolinguistic abstractions'? And what kind of epistemic guarantee are you looking for?

    13. @Greg: Thanks for the interesting comments, though I find some of them a bit puzzling. Most of all that you seem to be unaware of Postal's work [at least this is what your question indicates]. I think this speaks volumes for the power Chomsky has over your field: voices that are critical of him are silenced, even when, as in this case, the voice belongs to one of the most innovative syntacticians [you may want to have a look at Paul's 2011 Edge based clausal syntax: probably more rewarding than recursion papers... ]

      Postal is not describing behavioural dispositions any more than a mathematician studying analytic number theory or a logician studying propositional calculus are describing the behavioural disposition of individuals engaged in solving equations or logical puzzles. Does this mean Postal is committed to some form of dualism? You'd have to ask him. He probably is but my guess is you might be as well. If sentences are physical objects of some kind then you should be able to tell me where the sentence

      [1] "Many linguists are confused about recursion." is located.

      Is it in your brain, on the screen of your computer, on MY computer, in my brain...? We can agree that physical tokens of the sentence are located in those places. But, unless you're a hard core nominalist you probably also agree that there is an abstract type of this sentence that is not located in any of these places. Postal is a realist about sentences [he believes they do exist outside the physical realm] - you may prefer fictionalism...

      As far as behavioural dispositions are concerned; some people may be disposed to giggle when they hear [1], others might be disposed to protest etc. - this hardly tells us any more about its syntactic properties than me looking at the sequence of states (registers plus stack) a computer is passing through tells me about how the computer works.

      You say "The competence-performance distinction is a hypothesis to the effect that the best (or at least a good) way to account for the data is by postulating an abstract rule-governed domain, which underlies the actual behaviour." - as far as I know Chomsky locates competence in human brains [I-language]. But there is nothing abstract about brains, they are concrete objects. So where then is this abstract domain? I happen to agree with you that when we want to learn about syntactic relations we're neither interested in brain states nor in behavioural dispositions but in an abstract rule system, in relations that hold between abstract objects. Competence may explain how we can KNOW the abstract rule system but it does not explain the rule system itself.

      As for biolinguistics - I am not really the person to ask - I don't have the foggiest idea how biology enters into linguistics. If you read some of the earlier discussions on this wonderful blog you'll notice that I have begged Norbert more than once to tell us more about the biology of linguistics. But I was told knowing biology is above the pay grade of this leading biolinguist.

      Re the quality of the papers you mention: you must be aware of the party-line about the Hauser et al. paper: “Most of the linguistics was excised from Hauser, Chomsky and Fitch, for example, at the insistence of the journal” (Pesetsky, 2013a, slide 107). I predict we will hear something similar about the Watumull et al. paper fairly soon...

    14. @Christina: I apologize for having been so unclear.
      I am well-aware of Postal's work, and think that he is an extremely capable linguist. I just don't understand his philosophy.

      Sentence types are not physical objects, sentence tokens are. I am not committed to a particular position on abstract objects, although I am not partial to realism. Luckily, I am not trying to describe a set of sentence types. I am trying to describe how people use language, and how they learn to use language. Use of language ultimately boils down to people's behaviour, but, because behaviour is complicated, it's convenient (necessary) to talk about behavioural dispositions. I keep mentioning David Marr because I think that he and his clarified a mode of investigating this stuff which doesn't force us to strange and problematic views about `Knowledge Of Language,' which you seem to be thinking of in a different way. (In the WRONG way, to use the labels from my previous post.) In other words, we do not KNOW an abstract rule system; an abstract rule system is implemented in our brain/mind. Chomsky is confusing to read on this issue, because he never lets himself be pinned down. Many other linguists seem to think what you seem to think they think.

      Biology enters into the study of how people use language, how they learn to use language, and how they evolved to learn to use language, because people are biological organisms. Clearly, a complete theory of linguistics should ultimately be consistent with a complete theory of biology. I understand biolinguistics as saying `hey, let's start integrating these two fields now.' I think that this work is currently filled with much hand waving, but maybe that's a precursor to more serious stuff.

    15. Another possible way of putting one aspect of the above is that the grammar can be seen as a *description* of an aspect of the brain, much as an evolutionary tree can be seen as a description of the history of a family of lizards; both the evolutionary tree and the grammar are abstract objects, and oversimplifications of the reality, but are nevertheless useful for certain purposes, and have some kind of claim to partial truth (and both kinds of things tend to get revised rather often, as new information is acquired and old information rethought).

      There is however a difference in degree, in that the evolutionary trees are based on much broader, deeper and better established scientific principles; the difference in degree is so great that I don't sneer at people who think it's ridiculous to call grammar a branch of biology, but I claim that it is in the end only a difference in degree. What Postal and Katz did wrong is confuse a means of study with the object of study.

    16. @Greg, when you say that linguistics is about behavior, do you mean to say that the actual object of study is a certain set of behavioral dispositions? Or would you agree with Chomsky that the actual object of study is some mental faculty (whether domain-specific or not)?

    17. What is the difference though between a mental faculty and an appropriate set of behavioral dispositions? For me the word "faculty" just means some inherent power or disposition.

      Obviously given that we are all materialists (apart from perhaps Christina) any disposition to behavior must be rooted in some neural structures.

    18. @Alex C: I have not given up entirely on materialism. But maybe you or Greg can clarify what the difference is between behaviours and behavioural dispositions. When Greg said "What could linguistics possibly be about if not behaviour (broadly construed)? Nothing else is observable." I understood him to mean behaviour broadly construed includes behavioural dispositions. If this is not the case and dispositions are 'rooted in some neural structures' there does not seem much that the linguist can observe at the moment [given the current SOTA in neuro/brain science]. Also, at least according to Greg, linguistics seems clearly not to be about brain states [something we agree on].

      @Avery: Can you direct us to some publication in which "Postal and Katz ... confuse a means of study with the object of study"? I find this a very implausible claim given that these two go to great pains to show that Chomsky habitually conflates knowledge of language and language [e.g. here: and here ] According to K&P these are two distinct objects of study just as mathematics and knowledge of mathematics are different objects of study. Further, when you say "evolutionary tree and the grammar are abstract objects" you seem to accept implicitly Postal's platonism [unless you use 'abstract object' in a rather misleading way].

      @Greg: apologies for misunderstanding your comment about Postal. I do not think Postal's philosophy is difficult to understand [the two papers linked to above are pretty accessible]. Now whether or not one wants to ACCEPT Postal's ontology is of course a different matter. But I think before rejecting it one ought at least to know what it is. I agree that for much actual linguistic work the ontological questions play little if any role and I am not committed to any particular way of thinking about knowledge of language [other than that it is located in brains of the knowers [or possibly in some computers] I am unaware of truly convincing arguments for domain specificity but remain open minded about the possibility]. I do think, however, that Chomsky's view about the relationship between language and knowledge of language is either false [if one takes seriously what he writes] or very misleading [if he intends part of his writings to be taken 'metaphorically' without ever specifying where the metaphors begin].

    19. Please let's not let this thread drift into being about Platonism in linguistics again. We have discussed this topic at length on other threads.
      The metaphysical questions about the ontological status of abstracta are not relevant.

    20. @Alex C: Is a sentence such as "traces must be properly governed" synonymous with some statement regarding behavioral dispositions? The arguments against that sort of verificationist position are presumably familiar. (As Chomsky puts it, we don't think of physics as a "science of meter readings".)

      This has nothing to do with whether or not materialism is true. The majority of the evidence will be behavioral regardless.

    21. @Alex D: These are difficult issues. I think that part of the problem lies in what `about' means. In one sense, physics is absolutely `about' meter readings (etc), or perhaps better, it is about objects and their motion. Chomsky has long held (as far as I remember) that it makes sense to adopt a realist stance to one's best current theory of some domain. Once we have such a theory, and we provisionally adopt some form of realism wrt it, we can reasonably say things like `theoretical object X is what the theory is really about'. But, what is `theoretical object X', really? It is something we have postulated in order to get a better handle on the data. I think this kind of talk is harmless, as long as we keep in mind that we are engaging in a sort of façon de parler, and we only interact with people who are making the same assumptions. But we don't, and so it isn't.

      @Christina: I feel like you're putting words in my mouth! I didn't reject the Katz/Postal theory on this blog, I said that I don't understand it. I think it is good philosophical practice to understand a claim in the most charitable way possible. I cannot figure out how to understand the Katz/Postal theory in that way, and so I think I must not understand it.

      I think Avery's last sentence was a beautiful summary of my poor understanding of the K/P view; they seem to be confusing the computational level theory of a phenomenon with the object of study itself. I think we are all using `abstract object' in a way that you think is misleading.

    22. @Greg: As you say, these are difficult issues. There is probably no point in debating the philosophy of science here but I think it is interesting to flesh out the different positions.

      Wrt the meaning of 'about', the question I was trying to get at is whether the content of a scientific theory is exhausted by its empirical commitments. Or to put it another way, is a theory anything more than a concise notation for a set of observation sentences? Personally, I would reject that sort of verificationist position, but I am not attempting to argue the point here.

      As far as realism is concerned, I wonder if we could distinguish two different realist stances. One is realism about the faculty of language itself. I.e., the position that there is some component of the mind/brain which underlies our linguistic abilities. To my mind that kind of realism is harmless and sensible. It is roughly analogous to the position that there are (really) objects in motion for our theories to describe. A stronger realist stance is realism with respect to some particular linguistic theory. E.g., the position that the Barriers theory is a true description of some aspect of FL. It is much harder to justify realism of that sort (especially in a young science like generative linguistics).

    23. @Alex C: I have no intention to redo the Platonism debate, I merely answered a few questions and provided a couple of links. I did ask you a question though re the status of behavioural dispositions [we all agree they are no platonic objects] and Alex D's comment shows that I am not the only one concerned about the seeming equation of behavioural dispositions and properties of sentences. It seems we are looking potentially at 2 rather different research projects: one [1] is part of psychology: why do kids behave in certain ways when they learn English [the u-shaped learning curve for irregular verbs comes to mind]. The other [2] attempts [among other things] to figure out why sentences of English have the properties they do. Now knowing results from [2] is certainly helpful for anyone engaged in [1]. But there seems no a priori reason to assume that someone engaged in [2] will benefit from studying behavioural dispositions. So if you could clarify that for us it would be much appreciated.

      @Greg: I apologize for using a misleading formulation: I meant to make a general statement [one ought to know what one rejects] not a statement directed at you personally. So if the shoe does not fit please do not wear it. I have asked Avery for a citation that can substantiate what he attributes to Katz&Postal. I have never read anything by them that would support such an interpretation but of course I could have overlooked something.

    24. E.g., the position that the Barriers theory is a true description of some aspect of FL.
      I'm not a huge fan of philosophy of science (I always found SSK more insightful), but this is actually an interesting issue for linguistics: what does it mean to be a true description of FL?

      In contrast to physics, linguistics has a strong commitment to how a theory is stated. In SPE, for example, the rule notation itself played a role in assessing markedness and learnability. That's very different from, say, physics, where it really doesn't matter how you write a formula as long as it gets the right results and is easy to use. To some extent this is reasonable because linguistics has a mentalist commitment and physics obviously doesn't. But it means that this second, stronger position you stake out is actually a continuum that ranges from "getting the tree structures right" to "getting the derivations right" to something like "getting the descriptive complexity right", and maybe something even more radical. My hunch is that most syntacticians are fairly close to the "notationist" stance (not so sure about phonologists).

      This might even partially explain why there is less interest in truly theoretical work, as Norbert laments. If there's only one true theory, then that rules out one rich area of theoretical inquiry: characterizing an empirical issue in as many ways as possible.

    25. @Christina: could you give me an example of a recent paper that best exemplifies your program [2] that is not Platonist?

    26. @Thomas: I'm not sure that I see that much of a distinction between physics and linguistics in this respect. Notation appears to be more important in linguistics because we're dealing with it at two separate levels: we have theories of the "notation" of certain mental representations, and then we have the notation which we use to write down those theories. With regard to the latter, I'm not sure that we care any more than physicists. (And don't physicists sometimes care a little bit?)

    27. Say we have a CFG (I know ... but humour me) G and we binarise it left to right to get a CFG G'. Are these notational variants of each other? Could one be a true theory of English and the other not?

      (for CFG read well-nested MCFG say)
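      The binarisation question can be made concrete with a minimal sketch. It assumes one particular reading of "left to right" (a rule A -> X1 X2 ... Xn becomes A -> X1 <X2 ... Xn>, and so on), and the toy grammar, the fresh nonterminal names and both function names are hypothetical illustrations rather than anything from the thread. G and G' come out with the same string language but different rules, and hence different trees.

```python
def binarise(grammar):
    """Binarise a CFG left to right (one possible reading, see lead-in).

    grammar maps each nonterminal to a list of right-hand sides (tuples).
    Fresh nonterminals are named after the suffix they expand to.
    """
    out = {}

    def add(lhs, rhs):
        rules = out.setdefault(lhs, [])
        if rhs not in rules:
            rules.append(rhs)

    for lhs, rhss in grammar.items():
        for rhs in rhss:
            head, rest = lhs, tuple(rhs)
            while len(rest) > 2:
                fresh = "<" + " ".join(rest[1:]) + ">"
                add(head, (rest[0], fresh))
                head, rest = fresh, rest[1:]
            add(head, rest)
    return out


def strings_up_to(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from start."""
    lang = {nt: set() for nt in grammar}
    changed = True
    while changed:  # fixed-point iteration; terminates because the sets are finite
        changed = False
        for lhs, rhss in grammar.items():
            for rhs in rhss:
                partial = {()}
                for sym in rhs:
                    # symbols without rules of their own count as terminals
                    options = lang[sym] if sym in grammar else {(sym,)}
                    partial = {p + o for p in partial for o in options
                               if len(p) + len(o) <= max_len}
                new = partial - lang[lhs]
                if new:
                    lang[lhs] |= new
                    changed = True
    return lang[start]


# Toy grammar, purely illustrative.
G = {
    "S": [("NP", "V", "NP")],
    "NP": [("Det", "Adj", "N"), ("N",)],
    "Det": [("the",)],
    "Adj": [("big",)],
    "N": [("dog",), ("cat",)],
    "V": [("saw",)],
}
G2 = binarise(G)

same_strings = strings_up_to(G, "S", 7) == strings_up_to(G2, "S", 7)
same_rules = G == G2
```

      Here `same_strings` is true and `same_rules` is false: the two grammars agree on weak generative capacity while assigning different structural descriptions, which is exactly the gap the question is probing.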

    28. @Alex C (if this was directed at me, sorry for butting in if not). I don't know. In general, I don't know how to determine whether or not two theories are notational variants. Sometimes there are clear cases.

    29. @AlexD; yes it was. Just amplifying Thomas's interesting observation which is something that I have recently become very interested in from the strong learning perspective -- how close do you need to get to a grammar before it is just a notational variant, before it is sufficiently close to be "strongly equivalent".

    30. @Alex D: If you're just thinking whether, say, binary features have values + and - or 0 and 1, then no, linguists do not care. But just switching from a privative to a binary feature system is considered a huge deal, even if the two systems do the same work. The idea being that the binary feature system could in principle do more work (two feature values + absence of a feature), so it should be avoided unless necessary. Similarly, binarization of multi-valued feature systems is considered more than just an arbitrary change in notation. I think the SPE example I gave is also a clear-cut case where notation plays a huge role --- if you had different notational devices, certain rules would be easier to write and thus you would wind up making different empirical predictions.

      I can't speak authoritatively about physics, of course, but my impression is that physics is closer to math in this respect: if two theories do the same thing, use whichever one is better suited to the task at hand. For instance, after Dyson proved Feynman's account of QED equivalent to Schwinger's, people didn't debate which one was "truer", they just started using Feynman's because it was more intuitive (Disclaimer: I might be completely wrong about this, I read it in a Feynman biography a couple years ago).

      @Alex C: If you care about structural descriptions, as linguists do, then the G and G' are clearly not notational variants. And if you take the split between E-language and I-language seriously, even generating the same tree languages does not make two grammars notational equivalents (for instance two MGs that generate the same tree language but do so in a different way). I'm not even sure if having the same derivations is enough to count as notational variants for linguists, exactly because they care about things like your choice of feature system.

    31. @Alex C: Yes, that is a difficult question. However, strong equivalence is a relation between grammars, whereas realism in this context has to do with a relation between a theory (of a grammar) and the mind/brain. Trying to figure out whether a realist is thereby committed to any strong equivalence claim is making my head hurt! It may be that a realist could state his realist commitments without making use of the notion of strong equivalence at all.

      (Just to clarify, I am genuinely unsure as to whether realism with regard to current syntactic theories is justified.)

    32. @Thomas: The examples you give are cases where we are comparing different theories of the format of certain mental representations. I.e., it is not just a question of how we write things down but how the mind/brain "writes them down". So of course I would agree that these differences are not arbitrary. Again, the curious thing about linguistics is that we're dealing with theories of mental representations which themselves have a particular format or notation. So when we talk about "notation" it's important to clarify whether we're talking about the notation of these representations or the notation for our theories of them. Once we get this straight, I think we'll find that linguists are roughly as interested or uninterested in notational differences as people in other sciences.

    33. @Alex C: "Christina: could you give me an example of a recent paper that best exemplifies your program [2] that is not Platonist?"

      I am not sure if this is a trick question? Are you implying ALL work on what the lay-person would call linguistics is Platonist?

      I am making no claims re best but suggest for example the recent paper by Chris Collins and Ed Stabler "A Formalization of Minimalist Syntax" [available here: ] falls under what I called [2] above. As far as I know C&S are no Platonists.

      I did a word search and could find no reference to behavioural dispositions. I am no minimalist so I do not know whether those who are agree with C&S. But it would seem when evaluating C&S's paper behavioural dispositions play no role.

    34. @Alex D: but that's assuming that "notation of these representations" actually means anything, and that there is a way of measuring how close your notation is to this representation-notation. Porting this into mathematics, for example, amounts to whether the order-theoretic or the algebraic view of lattices is a better representation of lattices. And there simply isn't a way of answering this because the question is confused.

      And just to make sure that nobody thinks I'm taking some kind of empiricist anti-FL position here: it is perfectly reasonable to assume that FL is real, but that there is no unique description of it. And all I wanted to point out is that theory X is the true description is often construed as theory X is the only true description, with a very narrow range as to what counts as notational variants. And I think this actually has implications regarding what kind of results linguists are interested in.

    35. @Thomas: Yes, I am a realist about mental representations and I think there are facts of the matter regarding how they're constituted. I don't assume that there are necessarily unique correct descriptions of these facts, but that's not relevant to my point. We still need to distinguish theories about how certain mental representations are formatted from the format of the theories themselves. If we don't do this then we risk mistaking substantive differences between theories for merely notational ones.

    36. We still need to distinguish theories about how certain mental representations are formatted from the format of the theories themselves. If we don't do this then we risk mistaking substantive differences between theories for merely notational ones.
      Could you give an example of this? It doesn't have to be from linguistics --- physics/chemistry/CS/math analogies are fine, too. I just have a hard time imagining a scenario where an equivalence proof could create this problem.

    37. @Thomas: Let's take your example of binary vs multi-valued features systems. If I write down my syntactic theory using binary-valued features, then I may or may not be intending to commit myself to the hypothesis that FL uses binary-valued features. I.e., the use of binary-valued features could be an arbitrary notational choice that I made in the course of writing down my theory, or it could be a claim that the relevant mental representations really do have that format. Either interpretation is fine in principle, it's just important to distinguish them. Once we do so, I think it's clear that linguists have no special concern for the notation of our theories. What we do have a special concern for is the "notation" of certain mental representations, since that's the subject matter of the discipline!

    38. @Alex D: I think that, normally, the empiricist vs rationalist leanings of scientists are irrelevant. But in linguistics as she is practiced I think there are some `empirical' differences. Both rationalists and empiricists are happy to engage in heavy data manipulation (multiple center embeddings are god-awful by any measure), so as to allow for a more elegant theory.
      When I do this, I assume provisionally that this phenomenon can be given a nice explanation by factoring it into two parts. But no matter how elegant it is, my first part is only good to the extent that the second part can be fleshed out and that together they explain the explicandum. What you often see in mainstream generative grammar is `I don't need to account for this datum, it is not part of core grammar'. There is no qualification, no awareness that this conception of core grammar is only good in so far as it is explanatorily virtuous. And outsiders see this and (justifiably) take issue with it.

      As for your two realist stances, I do not understand the first. What are our linguistic abilities? Do you mean what I am clumsily describing as behavioural dispositions? If so, I am happy to do this. But note that this is just acceding to materialism about the mind. The other possibility is that you are asking whether it is harmless to assume that our linguistic abilities have a type I theory in Marr's sense. I think that's totally reasonable, as a working hypothesis, but it is only a hypothesis about how best to account for the phenomena.

      @Christina: You will find talk in most of Chomsky's work about `use of language' or `language use'. My locution `behavioural dispositions' is intended to be synonymous therewith (or a rational reconstruction thereof). Why does Chomsky talk about this but not anyone else? Because Chomsky's work is situating the inquiry, and linguists are working out a particular approach to this which has factored the problem of accounting for these into what we call competence and performance.

    39. Okay, then the question is why we should attribute so much importance to something so minor like the choice of feature system. There's infinitely many different ways of getting the dependencies you want --- some of them just as succinct as whatever proposal you might be advocating --- and no effective way of distinguishing between them that doesn't depend on numerous ancillary assumptions.

      So why make the notational commitment in the first place, rather than focusing on the underlying property you want to capture and stating it as clearly as possible, without relying on notation? That's still contributing to the goal of carving out the class of mental representations, but in a more flexible and sustainable way.

      To give a math analogy, it doesn't matter if you represent numbers in binary, decimal or hex, even if you want to state a condition like "the sum of the digits in decimal representation is greater than 13". You can find equivalent statements for the other number systems, so instead of insisting that decimal is the right notation for numbers, state the principle in a universal fashion that takes the base of your number system as a parameter to get the right result.
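      The analogy can be spelled out in a few lines. This is only a sketch, and the function names are hypothetical: the condition is stated over the number, with the base as an explicit parameter, so it gives the same verdict whether the number was written down in binary, decimal or hex.

```python
def digits(n, base):
    """Digits of a non-negative integer n in the given base, most significant first."""
    if n == 0:
        return [0]
    out = []
    while n > 0:
        n, d = divmod(n, base)
        out.append(d)
    return out[::-1]


def digit_sum(n, base=10):
    """Sum of the digits of n when written in the given base."""
    return sum(digits(n, base))


def condition(n):
    """The example condition: decimal digit sum greater than 13."""
    return digit_sum(n, base=10) > 13


# The same number in three notations; int(text, base) recovers the number itself.
numerals = [("239", 10), ("ef", 16), ("11101111", 2)]
values = [int(text, base) for text, base in numerals]
verdicts = [condition(v) for v in values]
```

      All three numerals denote 239, so `verdicts` is uniformly true. What does change with notation is `digit_sum(239, 16)` versus `digit_sum(239, 2)`, which is why the base has to be carried as a parameter rather than baked into the statement.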

    40. @Thomas: I wasn't suggesting that we should attribute much importance to the choice of feature system; it was just an example to illustrate the two levels at which issues of notation arise. I agree that in our present state of knowledge we can only find direct empirical support for very abstract claims about the properties of linguistic representations.

    41. As for your two realist stances, I do not understand the first. What are our linguistic abilities? Do you mean what I am clumsily describing as behavioural dispositions? If so, I am happy to do this. But note that this is just acceding to materialism about the mind.

      Our linguistic abilities are the abilities we have in virtue of our linguistic knowledge. I.e., the abilities may come first epistemologically but not ontologically. I suspect this is a point where we would disagree.

      I’m still unsure what materialism has to do with this. Whether or not materialism is true, our behavior results from the interaction between our minds and the external world.

    42. the abilities may come first epistemologically but not ontologically
      The abilities are the only things we have any epistemological access to. At the end of the day, these are the things that our theory is evaluated against. I think that we are (ultimately) responsible for all of the data. If we decide to say that some of the data is someone else's responsibility, we are not done until the buck has come to rest. If you agree with this, then I think that there should be no discernible difference in our practice.

    43. @Christina far above: I'd take Katz & Postal 1991 and many of the other references in your 2013 Biolinguistic Platonism paper as an example of the error of confusing the model a.k.a. tool of study with the subject matter of study (this is not entirely original with me, being an application to a somewhat different area of a comment that Steve Anderson made about a term paper I wrote as an undergraduate). Postal and Langendoen _Vastness of Natural Language_ would be another.

    44. Continuation of above: so, according to me, an evolutionary tree, a grammar, and the formal language that the grammar weakly or strongly generates are all abstract objects, used in the latter two cases to study the mental structures and behaviors that speakers of languages appear to have and produce.

    45. Re notation: Far above, Greg wrote:

      1) "Descriptive adequacy" (my trans.: accounting for linguistic behaviour (usual caveats for special sciences, ceteris paribus, dispositions, etc))
      2) "Explanatory adequacy" (my trans.: accounting for language acquisition i.e. the fact that our linguistic behavioural dispositions are learned)
      3) "Beyond explanatory adequacy" (my trans.: accounting for the fact that humans are able to do this stuff while phylogenetic neighbors cannot)

      My main point is just that linguists are working on point 1, and that the disputes about whether point 2 should be dealt with by domain general or domain specific mechanisms is therefore largely irrelevant to this work which is actually taking place.

      Linguists are actually, I believe, working on both levels 1 and 2, and the focus on notation comes from Chomsky's 'proto-Bayesian' use of the evaluation metric to address 2 in Syntactic Structures. So the idea is to tune the notations so that recurrent forms of generalizations will be easy to express relative to other possible things that don't occur often or at all (chapter 8 in Sound Pattern of English is perhaps as close as it gets to a clear presentation of the idea).

      So for example the choice between equipollent binary and univalent theories of features has various subtle consequences for what kind of linguistic systems we expect to find.

      Or, for another example, suppose that self-similar embedding did not exist in performance, so that all languages were like Piraha (except possibly having binary correlative clause and similar constructions). You could still make a linguistic argument for phrase structure as opposed to finite state grammars on the basis that the PSG notation favors grammars where the same form of subsequence, such as (Det) Adj* N, appears in multiple positions (somewhere between 5 and 20 would be my guess, for a basic not too horrible FSG for nonrecursive English).

      With the FSG, this is an accident, because the sequences could be different in various ways in all the positions, say, (Det) Adj* N in subject position, Adj Det N in object (Kleene star left out on purpose, 'subject' and 'object' terms used descriptively, not theoretically), whereas the PSG for this conceptually possible but empirically unlikely variant of English would be substantially longer/less probable according to the prior than the actual one.

      If we allow edge-embedding, the results are still within the (weak) capacity of FSGs, but the argument for PSGs gets stronger because the required 'covering' FSGs are much longer, with far more accidental repetitions.

      So, to implement Chomsky's original idea as I understand it, equivalent linguistic notations need to have an evaluation-metric-preserving as well as a language-preserving mapping between them (something that people never discuss afaik).

    46. And then there is the problem of structure/strong generative capacity.

      This has always been a complete and utter mess because the structures have never been anything other than an intuitively motivated but not coherently justified system of equivalence classes over the derivations; many linguists tend to think of them as having the job of 'supporting' semantic interpretation, but that's surely too vague to do anything mathematical with. Once upon a time, it was thought that people had intuitions about phrase-structure bracketings and transformational relations that could be used as evidence, but the former have never produced useful results in unclear cases such as the correct bracketing of 'far into Russia' (right? or left?), and the latter (John left vs did John leave) are presumably semantic relationships, so that idea was quietly abandoned in the early seventies, iirc., leaving the notion of strong generative capacity with a verbal existence but no useful referent in practice for theory comparison.

      My suggestion for 'strong generative capacity' is that it be replaced with 'compositional semantic capacity' in a type-theoretical framework, where we assume a system of semantic types such as 'e' for entity, 'p' for proposition, e->e->p for transitive verb, etc. and an assignment of types for lexical items, so that the grammar is responsible for constraining composition, as well as supplying any extra 'glue' material that is needed (such as a conjunction operator for iterated adjectives).

      There are surely problems with that, but what coherent alternatives are there?
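      A minimal sketch of what the 'compositional semantic capacity' suggestion above might amount to, assuming only function application as the mode of composition. The three-word lexicon, the `Arrow` class and the helper names are hypothetical, and the 'glue' material mentioned in the comment is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrow:
    """A function type arg -> res."""
    arg: object
    res: object


E, P = "e", "p"              # entity and proposition
TV = Arrow(E, Arrow(E, P))   # e -> e -> p, the transitive verb type

# Hypothetical type assignment for lexical items.
LEXICON = {"Kim": E, "Sandy": E, "saw": TV}


def apply_type(fn, arg):
    """Result type of applying fn to arg, or None if the application is ill-typed."""
    if isinstance(fn, Arrow) and fn.arg == arg:
        return fn.res
    return None


def combine(a, b):
    """Try function application in either direction; None if neither is well-typed."""
    result = apply_type(a, b)
    if result is None:
        result = apply_type(b, a)
    return result


# Bracketing [Kim [saw Sandy]]: the verb combines with the object first.
vp = combine(LEXICON["saw"], LEXICON["Sandy"])
sentence = combine(LEXICON["Kim"], vp)
bad = combine(LEXICON["Kim"], LEXICON["Sandy"])
```

      In this toy setting `sentence` has type `p` while `bad` is `None`: the types alone license some compositions and block others, and the grammar's job would be to constrain which of the type-correct compositions are actually available.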

    47. @Avery: I agree that linguists say that they are also working on (2), and that some theoretical claims are motivated by rhetoric about learning. Given that linguists typically do not know much about theories of inductive inference (or other kinds of learning), I feel that it best reflects reality to say that they are not really working on (2).

      As for SGC, the data we want to describe are, minimally, sound meaning pairings (but also probably prosody, psycholinguistic stuff, etc). So, we can ask what kind of relations a grammar can define. In formal language theory, we have studied this under the term `bimorphism', but not nearly as much as other stuff.

    48. I would still say they are addressing (2), at a Marr 'computational' rather than 'algorithmic' level (and arguably in a somewhat confused and incoherent manner, due to the deficiencies in concept and near unintelligibility of exposition of the evaluation metric, essentially fixable by Bayes, I suggest).

      What they don't do is find learning algorithms, but the basic Marr idea is that you do that later. Learning algorithms are in any event a very different skillset, but what they do will be heavily influenced by the nature of the hypothesis space they are operating on.

      & formally inclined linguists have to do computational (2), unless they want to limit themselves to analysis within preexisting formalisms, but nobody but linguists knows enough about typology to develop & choose between different ones.

      'bimorphism' as in the category-theoretic definition, or something else?

    49. @Avery [from a while back]. thanks for the reference to works by Katz&Postal. Since, as I said earlier, I was not able to find the confusion you allege in these works could you be a bit more specific and give a page reference [one is fine no need to go through all of them].

      You also say: "a grammar, and the formal language that the grammar weakly or strongly generates are all abstract objects, used ... to study the mental structures and behaviors that speakers of languages appear to have and produce."

      It seems you suggest that were we to encounter some physical computer of unknown structure the way to study its structure would be to look at some programming language? If this alone could reveal the structure you do not seem to believe in universal Turing machines [that can be implemented in very different physical structures] but seem to assume every structurally different computer has its own specific programming language?

      It also seems that even people aligned quite closely with Chomsky [like say Angela Friederici] claim that it is the functional neuroanatomist who looks at neurological underpinning for [linguistic] behaviour [for example the cytoarchitectonic structure of the frontal and prefrontal cortex] and at brain activation patterns [when subjects are engaged in linguistic tasks]. I could not find any of her articles on LingBuzz but here is a link to a recent paper that is not behind a paywall:

      Now it seems to me her project is quite different from what Collins&Stabler are doing - yet both seem entirely legitimate projects of inquiry [just as Katz&Postal say] - so again - exactly where does the confusion you attribute to K&P arise?

    50. For a more detailed citation, I'll suggest the first three paragraphs of sec 3 of the 1991 paper, concerning types and tokens. The types are useful classificatory buckets to dump things into, but if there weren't actual linguistic performances to classify with them, they wouldn't be very useful for linguistics, even if mathematicians found some reason to be interested in them. Therefore, according to me, they are part of a theory that helps us understand language, rather than the actual objects of study (for linguists unless they're wearing a mathematical linguistics hat for a while).
      Note for example that people routinely go beyond the notion of 'type' as encoded in orthography or transcription whenever they look at details of phonological performance, which for me does not constitute any dramatic shift of object of study, but just the addition of some new classificatory dimensions, typically continuous.

      Switching topics, if you encounter a computer of unknown structure with no manual, about the best you can do is try to make sense of its behavior, which in some cases is very opaque, but sometimes a few things can be worked out. In the case of language, quite a lot, due to typology and variation, including that there are implementations of things corresponding to constituent structure and the displacement property.

      Or, in another situation, perhaps old fashioned phonograph records....

      Deducing the existence of 'records' and some other stuff by analysing the behavior of an old fashioned electromechanical jukebox (research conducted by A Andrews, C Allen & E Dresher in El Phoenix Room, Brookline MA, c. 1976, very likely false for 21st century devices).

      Many uninformed people think that as the customers in a restaurant put money and select songs from the song-selection unit at their table, these songs will be played in the order selected. This turns out to be false.

      If you select a song twice in a row, it only gets played once. If you select three songs W, Y, Z, in that order, they don't necessarily get played in the order you select them, but in some invariant order (or some cyclic permutation thereof) typically represented in the catalog at the table.

      From these two facts we can deduce the existence of a cyclic structure on which the songs are arranged, such that selecting one tags that song for being played when the cycle gets around to it (that is, a finite state memory, one register for each song, rather than a pushdown stack storing requests).

      But there's more. The song-list is organized into 'side A' and 'side B' doublets, such that if you select just side B's, they will get played in the same cyclic order as their side A's will be, but if both a side A and a side B are selected, the A will be played on the first pass through the cycle, and B not until the next (with the consequence that you can prevent a B selection from ever being played, by selecting its A before it gets played, and doing so again as soon as it has finished). From this we can conclude that the songs are furthermore organized into functionally asymmetric doublets, each doublet occupying a position on the cycle, with the A member taking priority over the B.

      Peering into the machine revealed the existence of the wheel-like structure, and we had of course already known about the existence of the doublets, but if we hadn't, we obviously could have deduced their existence from the above observations.
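      The mechanism inferred in the jukebox story can be written down as a small model and checked against the reported behaviour. This is a sketch under assumptions: the class and method names are hypothetical, the wheel is taken to step through positions in a fixed direction, and a one-bit register per (position, side) stands in for the 'tagging'. Nothing here claims to describe the actual machine.

```python
class Jukebox:
    """Cyclic wheel of doublets with one selection bit per (position, side)."""

    def __init__(self, n_positions):
        self.n = n_positions
        self.selected = set()   # the finite-state memory: tagged (position, side) pairs
        self.pos = 0            # where the wheel currently stands

    def select(self, position, side):
        # Selecting an already tagged song changes nothing (it is a bit, not a queue).
        self.selected.add((position, side))

    def play_next(self):
        """Advance around the cycle to the next tagged song; A takes priority over B."""
        for step in range(self.n):
            p = (self.pos + step) % self.n
            for side in "AB":
                if (p, side) in self.selected:
                    self.selected.discard((p, side))
                    self.pos = (p + 1) % self.n   # a tagged B at p waits for the next pass
                    return (p, side)
        return None


def drain(jukebox):
    """Play until nothing is tagged, returning the play order."""
    played = []
    while (song := jukebox.play_next()) is not None:
        played.append(song)
    return played


# Selecting the same song twice plays it once.
j1 = Jukebox(5)
j1.select(3, "A")
j1.select(3, "A")
twice = drain(j1)

# Play order follows the wheel, not the order of selection.
j2 = Jukebox(5)
for position in (4, 1, 2):
    j2.select(position, "A")
cyclic = drain(j2)

# With both sides of a doublet tagged, B waits a full pass.
j3 = Jukebox(5)
j3.select(2, "B")
j3.select(2, "A")
j3.select(3, "A")
doublet = drain(j3)
```

      Under these assumptions the model reproduces the three observations: a repeated selection is played once, W, Y, Z come out in wheel order, and a B side is deferred behind its A side. A pushdown stack or queue of requests would predict otherwise on the first two, which is the sense in which the behaviour alone narrows down the internal structure.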

      When people discuss the possibilities for making inferences about the structure of a device from its behavior, they usually talk about pocket calculators, which are literally designed in order to correctly implement pre-existing rules, so that the 'Veil of Ignorance' is indeed thick. But for devices not designed to produce specific forms of behavior, the Veil is often thinner (tho, at some point, you do need to look inside).

    51. @Avery: Thanks for your interesting comments. It seems you and K&P mean something rather different when talking about types. For K&P types are abstract objects that exist independently of any physical tokens. [How we can gain knowledge about these types is an intriguing but independent question]. They offer no theory about behavioural dispositions but [just as Collins&Stabler do] work on the relationships between abstract objects [sentences, sets etc.]. You may find this less interesting than working on psychology but there seems nothing confused about the work by K&P.

      I find your juke-box example entertaining but am not sure how it relates to the case of human language [which is what we're really interested in]. You say: "When people discuss the possibilities for making inferences about the structure of a device from its behavior, they usually talk about pocket calculators, [for which] the 'Veil of Ignorance' is indeed thick. But for devices not designed to produce specific forms of behavior, the Veil is often thinner"

      I guess we agree that human brains are not designed to produce specific kinds of behaviour. So let's assume Chomsky's frequent collaborator, martian scientist S looks at linguistic behaviour of say Piraha speakers [and assume Dan Everett is right about the properties of Piraha]. How would S from his field work infer that the behaviour is produced by a device that generates [an infinite range of] self embedded structures?

      Now let's assume S has a colleague M and M has observed behaviour of English speakers. S and M meet and compare field-notes. Based on what would they conclude that a device with the same structure produces both behaviours? It would seem at least possible that based on behavioural evidence alone S would conclude the two languages should be described by different types of formal grammars etc.

      Presumably things would change if S and M had a meeting with Angela Friederici and she assured them that the behaviour of both speakers is generated by an identical device [let's assume she'd do that]. So if you help yourself to this additional piece of knowledge ['peer inside the machine'] the veil of ignorance might be relatively thin. But you claim you can deduce [a lot of] structure from behaviour alone.

      Finally, when I look at the history of GG work it seems that the kinds of structures that have been deduced based on much more than behaviour alone have changed rather dramatically over the years. I have no reason to believe that at any given time Chomsky was NOT putting his best foot forward. If someone of his intellect made such different deductions at different times it would seem the veil of ignorance is fairly thick...

    52. Given S's work alone, one would not conclude that the speakers were best described by grammars with self-embedding (ie they're running FORTRAN interpreters, maybe they're hardwired to run only FORTRAN). Add M's work to the mix, and then the first thing to do is to make sure that the speakers have the same 'language learning potential' (LLP, a.k.a. UG; their children can learn each other's languages). That this is true for all people and all languages today is a dogma that seems to be true at least approximately, but actually, I don't think we really know that children with 2K years of Chinese ancestry might not have a detectable if mild deficit in learning Warlpiri or Kayardild, with their complex case-and-other feature-marking systems and extensive scrambling in the case of Warlpiri.

      Having established that the populations have the same LLP, it would then have to be the case that they can learn languages with self similar embedding, but that their learning mechanism doesn't acquire them unless exposed to actual examples.

      So if we adopt Chomsky's idea from early generative days of using a grammar notation as a theory of typology and learning (emerging gradually between LSLT/SS and Aspects, I think), we want some kind of PSG-like notation, where 'recursive symbols' are possible but not necessary.

      But PSG-like covers a lot of possibilities; various kinds of dependency and categorial grammars and who knows what else would also work for two languages. Rummaging through all these and more given hundreds of languages should be expected to be time-consuming. Note also that what I consider to be an essential piece of the puzzle originally missed by Chomsky, the Bayesian notion of 'fit', was only acquired fairly recently.

      This is essential to the Piraha debate, since, given somewhat reasonable assumptions about UG (that for example proper names appear under a ('traditional') NP as in my 'Guessing Rule 1' paper on lingbuzz), the learner needs to use negative indirect evidence to suppress recursive possessive structures in Piraha.

      Chomsky's or anybody else's level of intellect is a red herring here, there is just an enormous amount of detail to get on top of. & it's possible that the willingness to fill your brain with weird facts about the grammar of Greek, Icelandic, Warlpiri etc. is even psychologically contradictory with attaining and maintaining enough mathematical facility to notice when two different-looking formal presentations might really be the same.

    53. Thanks for the comments Avery. Interesting suggestions but I think you overlook the main purpose of Chomsky's martian scientists: they were introduced to force us to eliminate stuff we know about humans [like that your kids would have learned Hungarian had you raised them in Budapest etc.] and focus on what one can infer based on observation alone. So it is not really clear to me that it would occur to S or M that Piraha kids COULD learn English. Even if S and M have concluded [based on their observation] that kids face a poverty of the stimulus situation and concluded kids have an innate mechanism supporting language acquisition, it is not clear to me that they would hypothesize the 'Piraha mechanism' can learn English [ex hypothesi S and M are ignorant about human evolution, human biology etc. etc. - it was your claim we do not need to look inside [at least not initially]]. Based on observation S and M just might conclude Piraha has a FSG and English has a PSG. And if Friederici tells them that some non-human primates can deal with FSG but only humans can handle PSG, S and M might wrongly conclude that the Pirahas are not humans.

    54. My claim is not that there is no general need to look inside, but that there are often some things that can be established without doing that, so that the veil of ignorance is not absolute, but thinner in some places than in others. If it never occurred to the Martian scientists to find out whether S and M's populations' children could learn each other's languages, they might well assume that each had different UGs with different internal mechanisms. M would then have made an error which could have been averted by being more curious or more conscientious, similar to what often happens with our own species (Bernie Madoff's investors, for example). & in any event what S & M need to do, and what human linguists actually did, was look not 'inside', but 'around' in what we might call the 'natural history of language'. (I'm not sure when the independence of grammar and race was established, the first place I'd look if I wanted to find out would be Whitney's 'Introduction to the Study of Language', hope I've got the name right. Then work back from there.)

      But lazy and incurious M would not be completely wrong; the grammar of Piraha is not really as expected from FSM organization, but from what might be called a 'fixed phrasal hierarchy' organization, which is basically like FORTRAN (traditionally 'nonrecursive') vs. ALGOL-60 (traditionally 'recursive', note reference given above). For a FORTRAN-like language, each subroutine can have a single 'return register' which records the position in the calling routine from which the subroutine is called, so that you can continue when it's done. E.g. if the subroutine is NP called in subject position, the return register will point to a position right before the verb; for an NP called in object position, the position right after the object and before whatever comes next (predicate adjective, PP or whatever). A finite number of subroutines each with its own return register is weakly equivalent to the traditional forms of finite state machines, but it captures different generalizations and suggests different things about internal structure.

      And this M's architecture can be upgraded to what's needed for S by replacing the individual return registers with a pushdown store, so they are not completely different stories. But we also know that the pushdown store is 'flakey', so that central self embedding can't go very far.
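
      To make the register-vs-pushdown contrast concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a claim about any actual grammar: a single hypothetical routine that recognizes centre-embedded strings a^n b^n, a hypothetical `ReturnRegister` standing in for the FORTRAN-style single return slot, and a hypothetical `PushdownStore` whose optional `limit` is only a crude stand-in for the 'flakey' store.

```python
class ReturnRegister:
    """FORTRAN-style storage: one slot, so only one call can be active."""

    def __init__(self):
        self.slot = None

    def save(self, addr):
        if self.slot is not None:
            return False  # routine already active: no room for a second return address
        self.slot = addr
        return True

    def restore(self):
        addr, self.slot = self.slot, None
        return addr


class PushdownStore:
    """ALGOL-style storage: a stack of return addresses, optionally depth-limited."""

    def __init__(self, limit=None):
        self.items = []
        self.limit = limit

    def save(self, addr):
        if self.limit is not None and len(self.items) >= self.limit:
            return False  # the store has given out at this depth
        self.items.append(addr)
        return True

    def restore(self):
        return self.items.pop()


def accepts(tokens, store):
    """Recognize a^n b^n (n >= 1), keeping return addresses in `store`."""
    pos = 0
    if not store.save("done"):  # top-level call into the routine
        return False
    pc = "open"
    while True:
        if pc == "open":  # the routine starts by reading an 'a'
            if pos < len(tokens) and tokens[pos] == "a":
                pos += 1
                pc = "maybe_embed"
            else:
                return False
        elif pc == "maybe_embed":  # another 'a' means a self-embedded call
            if pos < len(tokens) and tokens[pos] == "a":
                if not store.save("close"):
                    return False
                pc = "open"
            else:
                pc = "close"
        elif pc == "close":  # read the matching 'b', then return to the caller
            if pos < len(tokens) and tokens[pos] == "b":
                pos += 1
                pc = store.restore()
            else:
                return False
        elif pc == "done":
            return pos == len(tokens)
```

      With the single register the routine handles `ab` but rejects `aabb`, because the embedded call has nowhere to record its return address; swapping in the pushdown store, with no other change to the routine, lets the same machine handle the embedded case, and a small `limit` makes it fail again once the embedding gets deep.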

    55. @Avery: It is probably best we just agree to disagree. You seem convinced that only Chomsky's LF *hypothesis* is correct [in principle though maybe not in detail] and I think not all viable alternatives have been ruled out.

      In addition to this disagreement I notice that in virtually all of your replies you change your previous assumptions to something I did not respond to. Right now you seem to say that behavioural evidence is just one part of evidence, looking 'inside' and looking at even further evidence is important too - we really do not disagree about this point then. I only objected to your much stronger earlier claims re behavioural evidence providing virtually all the interesting information... [the very thin veil of ignorance claims way-above]

    56. LF? Not sure what that is. I think Chomsky thinks the veil is patchy too, and all I intended above to claim to be able to see through it were some facilities related to recursion, basically, subroutines and pushdown stores. This is supposed to be in opposition to people such as K&P and others (Bruce Derwing, Baker & Hacker iirc) who really do seem to think that it is absolute (the level 1.5 stuff, in Martin Davies' version especially, seems excellent to me).

      I also think that C has recently been extrapolating too wildly, although the minimalist ideas do seem to have turned up quite a lot of interesting stuff. So maybe 'too wildly' is just my reaction to him being smarter than I am.

  10. This comment has been removed by the author.

  11. I agree with Alex D in taking a very realist view of what linguistic theories are theories of: a purported natural object in the mind/brain whose properties we are trying to determine. The way we do this is by presenting data of various sorts (acceptability, ambiguity, parsability etc.). The 'etc' is important for being a realist: the data is potentially open-ended, as a mechanism is not identical to the data that the mechanism is involved in generating. This is a standard Rationalist conception of theory (see the posts on Cartwright, who I think develops the themes in interesting ways) and it has always contrasted with empiricist (tending to instrumentalist) conceptions of theory, which amount to, more or less, compact ways of representing the data.

    1. A quick thought experiment: consider the situation where we have come up with a theory that completely models the internal device, we even somehow know without a shadow of a doubt that there is no piece of data where the real language organ and our theory disagree. But then it actually turns out that while the two behave exactly the same, the language organ is a lot less appealing than our theory. For instance, where the latter says "a", the former says "not not a". Where we have streamlined, unified data structures and algorithms, the language organ uses a hodgepodge of hacks and tricks. I, for one, would be happy to say in this situation "there's nothing wrong with being better than reality" and stick with the theory we already have. What is your position?

    2. If "sticking with" the theory means anything other than accepting it as a true description of reality, then you are of course free to do that as far as the realist qua realist is concerned.

    3. @Norbert: The `etc' is important for everybody. The empiricist is happy to recognize that the database increases monotonically with time; the great blooming, buzzing confusion simply continues to buzz and bloom.
      As I mentioned above, I typically don't think that it matters whether individuals have rationalist or empiricist interpretations of what they are doing, and I think that you and I are trying to do the same thing. I think it is worrisome that Christina does not recognize my `empiricist' description as `the same thing' as what you are doing. This means that the typically rationalist rhetoric has really obscured (at least to outsiders) what linguists are actually doing.

    4. @Greg: Yup, it is important, and, moreover, open-ended. The problem with identifying the problem as modeling the data is that this supposes there is some way of knowing what the relevant data are. This is an epistemological confusion. We don't know what the right data are, for the relevant data are those that tell us something about the underlying reality. There is no way around this circle. However, if one does not accept this, there is a terrible tendency to declare some data important and others not and to concentrate on the data that one has easily at hand at a given time. This is too bad.

      Re confusing the outside world: I think that the problem is that too much of the outside world doesn't understand the practice of inquiry very well, including many linguists (again read Cartwright on this in physics). The aim of science is to determine the underlying causal powers. In syntax, this means the underlying structure of FL and its various operations and primitives. There are lots of empirical routes to this end that respond to various methods. The aim is NOT to model the data, but to find the underlying structure. If this confuses people, sorry. It's what the other sciences understand as routine. Physicists want, for example, to understand the underlying structure of the atom, say. The measurements are used to find this. The theory is not a theory OF measurements, but of this structure. Indeed, often enough the measurements are discarded as irrelevant because, the argument runs, they misrepresent the underlying structure. This is my view, the one that one finds in the sciences generally.

      One last thing: I agree that what I say may not comport with what linguists actually do. It is what linguists SHOULD do. Fortunately, what they actually do is often easily reinterpreted in what I consider reasonable terms. Sometimes not. I believe that the disdain for theory is often tied with a misunderstanding of the goal of inquiry. But that's a topic for another post.

    5. @Norbert: Everything is important. We want to understand the world. I'm currently on `team linguistics', but if I can suddenly make sense of some of the data that `team anthropology' is working on, then that takes some of the pressure off of them.

      Be it what the other sciences understand as routine or not, it is clearly wrong. The theory is of some data. You can flatter yourself and say that you are really investigating phlogiston all you want, but at the end of the day, you invented phlogiston (or discovered it, as you prefer) to make sense of the phenomena, and your investigations of the properties of phlogiston are just investigations of the data through this theoretical lens.

      Yes, one often discards some data as irrelevant; biased; uncontrolled. This is common practice, and is in some sense forced upon us by the vastness of data. I think the only philosophically defensible practice is to commit to describing it all, as in MDL (minimum description length). Note that if I had an explanation for some data that you dismissed as irrelevant, I would justly claim that as a virtue of my theory.

    6. @Greg
      My last words on the matter, as continuing will not clarify matters. The theory is not "of" some data but is some structure/power/thing/object. The data is used to investigate its properties. We use X rays to examine chemical structure. The theory is a theory of this structure, not of the X-ray diffractions. We gather data concerning sunspots on the surface of the sun to investigate what's going on with the sun's inner structure: the data is the surface temperatures, emissions, spots etc, the target of inquiry is the structure of the sun which accounts for these data. It is not a theory of the data. But you know all of this, I suspect, and disagree.

      BTW, thx for allowing me to flatter myself. I will feel much better when I do so. I will leave you with the last word should you wish to take it.

    7. sorry: is "of" some structure/power/thing...

    8. I think we probably agree more than it sounds like we agree, and are talking past one another. I agree that we use words in the way you say we do. You are of course right that no one says `this is a theory of the X-ray diffractions'. I think everyone is a little realist, deep in their hearts, and this has influenced our language. But our theory is supposed to shed light on a bunch of data, and ultimately lives or dies based on how well it does this. If someone comes along with a better theory, does it turn out that we were investigating the properties of something which turned out not to exist? I would say no, we were simply trying to organize the data in an inferior way.
      You are also right that this is a good place to stop. If professional philosophers are still arguing about this, what chance have we for resolution here?

    9. @Greg: careful now, Norbert IS a professional philosopher. And I fear if you believe data can ultimately falsify a theory you subscribe to a view of scientific inquiry that Galilean scientists like Norbert reject. He happily accepts that a theory [or program or whatever] makes false predictions - as long as it is not boring. Fortunately there is no need to go over this again; here is a link to Norbert's ingenious "Against Boredom" addition to philosophy of science: