Faculty of Language: Jeff W comments on comments on recursion

Wednesday, January 15, 2014

Jeff W comments on comments on recursion

I asked Jeff Watumall to respond to some of the points made concerning our previously flagged paper. He was the real driving force behind our joint effort. Thx Jeff. Here's what Jeff has to say.

******

On “On Recursion”

Our paper (http://www.frontiersin.org/Journal/10.3389/fpsyg.2013.01017/abstract) has generated interesting discussion in a previous post (http://facultyoflanguage.blogspot.com/2014/01/more-on-recursion.html). Here I comment on those comments.

Turing and Gödel:

It is no error to equate Turing computability with Gödel recursiveness. Gödel was explicit on this point (I am quoting from numerous Gödel papers in his Collective Works; I can furnish references if requested): “A formal system can simply be defined to be any mechanical procedure for producing formulas, called provable formulas[...]. Turing’s work gives an analysis of the concept of ‘mechanical procedure’ (alias ‘algorithm’ or ‘computation procedure’ or ‘finite combinatorial procedure’). This concept is shown to be equivalent with that of a ‘Turing machine.’” It was important to Gödel that the notion of formal system be defined so that his incompleteness results could be generalized: “That my [incompleteness] results were valid for all possible formal systems began to be plausible for me[.] But I was completely convinced only by Turing’s paper.” This clearly holds for the primitive recursive functions: “[primitive] recursive functions have the important property that, for each given set of values of the arguments, the value of the function can be computed by a finite procedure.” And even prior to Turing, Gödel saw that “the converse seems to be true if, besides [primitive] recursions [...] recursions of other forms (e.g., with respect to two variables simultaneously) are admitted [i.e., general recursions].” However, pre-Turing, Gödel thought that “[t]his cannot be proved, since the notion of finite computation is not defined, but it serves as a heuristic principle.” But Turing proved the true generality of Gödel recursiveness. As Gödel observed: “The greatest improvement was made possible through the precise definition of the concept of finite procedure, which plays a decisive role in these results [on the nature of formal systems]. There are several different ways of arriving at such a definition, which, however, all lead to exactly the same concept. The most satisfactory way, in my opinion, is that of reducing the concept of finite procedure to that of a machine with a finite number of parts, as has been done by the British mathematician Turing.” Elsewhere Gödel wrote: “In consequence of [...] the fact that due to A.M. Turing’s work a precise and unquestionably adequate definition of the general notion of formal system can now be given, a completely general version of Theorems VI and XI [of the incompleteness proofs] is now possible.”

Intension and Extension:

Properly formulated formal systems can be understood as intensionally and extensionally equivalent to Turing machines. In such systems the axiomatic derivations correspond to the elementary computation steps (e.g., reading/writing); this is as constructive as a Turing machine. (There exists a machine that directly performs derivations in the formal system rather than encoding the information in binary strings to be manipulated by the machine.) Accordingly, Gödel did not see formal systems and Turing machines as simply extensionally equivalent: a formal system is as constructive as a proof: “We require that the rules of inference, and the definitions of meaningful formulas and axioms, be constructive; that is, for each rule of inference there shall be a finite procedure for determining whether a given formula B is an immediate consequence (by that rule) of given formulas A₁, ..., A_n[.] This requirement for the rules and axioms is equivalent to the requirement that it should be possible to build a finite machine, in the precise sense of a ‘Turing machine,’ which will write down all the consequences of the axioms one after the other.” This equivalence of formal systems with Turing machines established an absoluteness: “It may be shown that a function which is computable in one of the systems S_i or even in a system of transfinite type, is already computable in S₁. Thus, the concept ‘computable’ is in a certain definite sense ‘absolute,’ while practically all other familiar metamathematical concepts depend quite essentially on the system with respect to which they are defined.” Gödel saw it as “a kind of miracle that”, in this equivalence of computability and recursiveness, “one has for the first time succeeded in giving an absolute definition of an interesting epistemological notion, i.e., one not depending on the formalism chosen.” Emil Post went further into ontology: The success of proving these equivalences raises Turing-computability/Gödel-recursiveness “not so much to a definition or to an axiom but to a natural law” (Post 1936: 105). As a natural law, computability/recursiveness applies to any computational system, including a generative grammar.

Rules and Lists:

The important aspect of the recursive-function/lookup-table distinction is not computability per se (table look-up is trivially computable) but explanation. A recursive function derives--and thus explains--a value. A look-up table stipulates--and thus does not explain--a value. (The recursive function establishes epistemological and ontological foundations.) Turing emphasized this distinction, with characteristic wit, in discussing “Solvable and Unsolvable Problems” (1954). Imagine a puzzle-game with a finite number of movable squares. “Is there a systematic way of [solving the puzzle?] It would be quite enough to say: ‘Certainly [b]y making a list of all the positions and working through all the moves, one can divide the positions into classes, such that sliding the squares allows one to get to any position which is in the same class as the one started from. By looking up which classes the two positions belong to one can tell whether one can get from one to the other or not.’ This is all, of course, perfectly true, but one would hardly find such remarks helpful if they were made in reply to a request for an explanation of how the puzzle should be done. In fact they are so obvious that under the circumstances one might find them somehow rather insulting.” Indeed. A look-up table is arbitrary; it is equivalent to a memorized or genetically preprogrammed list. This may suffice for, say, nonhuman animal communication, but not natural language. This is particularly important for an infinite system (such as language), for as Turing explains: “A finite number of answers will deal with a question about a finite number of objects,” such as a finite repertoire of memorized/preprogrammed calls. But “[w]hen the number is infinite, or in some way not yet completed[...], a list of answers will not suffice. Some kind of rule or systematic procedure must be given.” Gallistel and King (2009: xi) follow Turing’s logic: “a compact procedure is a composition of functions that is guaranteed to generate (rather than retrieve, as in table look-up) the symbol for the value of an n-argument function, for any arguments in the domain of the function. The distinction between a look-up table and a compact generative procedure is critical for students of the functional architecture of the brain. One widely entertained functional architecture, the neural network architecture, implements arithmetic and other basic functions by table look-up of nominal symbols rather than by mechanisms that implement compact procedures on compactly encoded symbols.”

Iteration and Tail Recursion:

This is mathematics, not computer science. (Or, rather, I am a mathematician, now interloping in linguistics. In mathematics, iteration--a general notion applicable to a pattern of succession--is seen as a form of recursion: the function f is defined for an argument x by a previously defined value (e.g., f(y), y < x); but iteration is “tail” recursion given that the previously defined value y is the immediately previously define value.) We are on the computational level, not the level of mechanisms. It is important to recall that Marr and Nishihara (1978) distinguished four--not three--levels: “At the lowest, there is the basic component and circuit analysis--how do transistors (or neurons), diodes (or synapses) work? The second level is the study of particular mechanisms: adders, multipliers, and memories, these being assemblies made from basic components. The third level is that of the algorithm, the scheme for a computation; and the top level contains the theory of computation.” (The theory of computation is mathematical.) Much of the muddling of iteration and tail recursion in the comments on the previous post is the result of misclassifying the level of analysis. “[W]e may consider the study of grammar and UG to be at the level of the theory of computation” (Chomsky 1980: 48). Thus discussion of loops, arrays, etc. is irrelevant. In fact, algorithms and mechanisms are arguable irrelevant in principle. We concur with Chomsky that, for the computational system of language “there’s no algorithm for the system itself; it’s kind of a category mistake. [T]here’s no calculation of knowledge; it’s just a system of knowledge[...]. You don’t ask the question what’s the process defined by Peano’s axioms and the rules of inference, there’s no process” (Chomsky 2013a). Analogously, a Turing machine is not a description of a process or algorithm or mechanism but “a mathematical characterization of a class of numerical functions” (in the words of Martin David (1958: 3), one of the founders of computability theory). Thus to define the faculty of language as a type of Turing machine as we did in our paper, “On Recursion,” is to give a function: “a finite characterization of an infinite set” (Chomsky 2013b). A Turing machine--and thus the language faculty--is defined by a tuple containing a finite set of symbols (axioms), a set of states (with “states” defined as “structures” in the sense of mathematical logic), and a transition function (rule of inference) mapping from state/symbol to state/symbol. “A derivation is thus roughly analogous to a proof with Σ,” a finite set of initial symbols, “taken as the axiom system and F,” the finite set of rewrite rules (or Merge), “[taken] as the rules of inference” (Chomsky 1956: 117), consistent with Gödel’s characterization: “We require that the rules of inference, and the definitions of meaningful formulas and axioms, be constructive; that is, for each rule of inference there shall be a finite procedure for determining whether a given formula B is an immediate consequence (by that rule) of given formulas A₁, ..., A_n[.] This requirement for the rules and axioms is equivalent to the requirement that it should be possible to build a finite machine, in the precise sense of a ‘Turing machine,’ which will write down all the consequences of the axioms one after the other.

89 comments:

Alex DrummondJanuary 15, 2014 at 10:57 AM
Regarding iteration and tail recursion, I think it is rather uncharitable to locate the “muddle” in the minds of the commentators. The term “tail recursion” is one that is standardly used in a computer science context with reference to particular computational mechanisms. (Tail recursion is an interesting special case primarily because it can be implemented in a space-efficient way.) Using the term “tail recursion” in a mathematical context without defining it is bound to lead to confusion. Your response still doesn’t give a definition of tail recursion, so I still don’t know what you mean when you say that a given function is or is not tail recursive. I don’t want to speak for other commentators, but this issue seems to have puzzled a couple of people who have a strong background in mathematics and computer science. (Also, I don’t know which function you’re talking about in the case of Pirahã.)
ReplyDelete
Replies
Alex DrummondJanuary 15, 2014 at 12:05 PM
This comment has been removed by the author.
ReplyDelete
Replies
benjamin.boerschingerJanuary 15, 2014 at 1:41 PM
"Thus to define the faculty of language as a type of Turing machine as we did in our paper, “On Recursion,” is to give a function: “a finite characterization of an infinite set” (Chomsky 2013b). "

- I always thought defining language (or the faculty of language, or whatever) as an X and then drawing conclusions about the thing thus defined (in particular about certain sets thus defined) was something Chomsky(ans) accused their critics of, not something Generative Grammarians were doing. Perhaps the use of "define" here is just somewhat unfortunate, but even then...

- as Alex C said in the other thread,

"Essentially everyone in cognitive science from Turing (1950) onwards has assumed that the human mind/brain and indeed animal mind/brains are computable. This is an upper bound on the power of the brain." (source, my emphasis)

So what's the informative content of "The FLN is (like) a Turing Machine", in particular if we really take care to stick to the purely "computational" level of input-output mapping specification?
ReplyDelete
Replies
Greg KobeleJanuary 15, 2014 at 2:43 PM
I wonder at the reviewing process that let this through. I am fairly well versed in this area, and so I took the liberty of commenting on this paper. (I must post this in batches.)
ReplyDelete
Replies
Greg KobeleJanuary 15, 2014 at 2:44 PM
=== BEGIN COMMENTS ===
- computability ::
+ you define E-language ("departing from Chomsky") as the extension of a function, and then I assume that I-language would be the function qua Turing Machine (or RAM, or lambda term, or Abacus, or Post-system, or partial recursive function term, etc).
+ you claim that "as far as we know, no such distinction applies to [...] non-human animals". In light of the previous definition, it seems like you must be claiming that people working on non-human animals only care about the observed behaviour (extension) and not about any intensional characterization thereof. This seems clearly falsified by experiments on animal pattern learning, where people claim that animals (monkeys, birds, etc) have learned certain patterns (usually finite state, sometimes context-free).
+ The formal bit of this section is a poorly presented version of absolutely basic material present in any textbook on formal language theory.
- induction ::
+ \lambda-definability coincides of course with the class of partial recursive functions.
+ `embroidered', really?
+ the notion of `primitive recursivity' is different from the notion of `partial recursivity'; if you are not mistakenly confusing the two, I do not understand what the `primitive notion of recursion' is.
+ the characterization of recursive functions you give is woefully incomplete (you do not say how to `recursively define' a function in terms of previous ones). A simpler (and correct) way of giving the notion of partial recursivity is as the closure of the set of constant zero function, projection functions, and successor function under the operations of generalized composition, primitive recursion, and minimization functionals. It is hard for a layman to unpack the correct definition from the glib term `recursively define.'
+ your second paragraph is either false or so misleading as to be garbage
+ it is not true that rewrite rules determine successive TM configurations; typically you will have to make multiple TM transitions to simulate one rewrite step.
+ beginning of fourth paragraph. what is it to `represent expliticly the recursiveness'? The definition of `recursive' that we have at this point is Goedels (partial) recursive functions (or, maybe) his primitive recursion functional. Ah, but wait, you define this in the next sentence `recursed [=] returned'. And definitions by induction are defined to be definitions by recursion (it seems again that this is not the primitive recursion functional but rather some intuitive notion). In what sense does a function defined by induction (in your sense) strongly generate something? If you have something concrete in mind, you are simply not expressing it clearly.
+ rest of fourth paragraph. I strongly disagree with this point about weak generation (i.e. strings) being irrelevant for linguistics (the most interesting claims about language universals I know of (Joshi's mild context-sensitivity hypothesis) are made in terms of string sets), however this is not relevant to the main thrust of this article.
+ fifth paragraph. by mathematical induction I assume you mean Peano's axiom to the effect that any set that is closed under successor and contains zero contains all naturals. (Which the reader must infer from the bit about the successor function in the Goedel quote.) Your goal in this paragraph seems to be motivating that the parts of natural languages that we observe should be analysed as finite projections of infinite sets. Your strategy is to show that the formalism of transformational grammar allows you to define infinite sets. I cannot make sense of the analogy to the successor function.
ReplyDelete
Replies
Greg KobeleJanuary 15, 2014 at 2:45 PM
- unboundedness ::
+ This first paragraph is confused.
+ Next paragraph, you want to say that a (partial?) recursive function might have an infinite range, and yet, something which we might want to view as implementing that function might fail to behave as that function predicts on more than a finite number of inputs. This is typical fare for cognitive science (in the vein of Pylyshyn and Marr).
+ next paragraph. this is a conjecture, on your parts, that whatever is responsible for the arbitrary limit on outputs is liftable in a principled way. This may be true for certain deviations from the I-language ideal, but not for others (such as garden paths, which sometimes require training to see). Grammars generating infinite sets are postulated in order to account for the regularities of behaviour (special sciences, ceteris paribus, etc), and these arbitrary limits to account for why the predictions of the grammar aren't borne out. The move to a grammar generating an infinite set is justified to the extent that the combination of the grammar plus limits is together more elegant than a finite list. Therefore, this is both an empirical conjecture about possible behaviour and a suggestion about how to account for this postulated behaviour should it actually be manifest. Thanks to people like Paul Smolensky we know how to implement symbolic algorithms in connectionist nets. Furthermore, work on animal pattern recognition commonly assumes that animals are `in fact' learning patterns (like $A^*B^*$) which in actual fact they would certainly not recognize the longer instances of.
+ last paragraph. I can no longer make sense of this in terms of (partial) recursive functions.
- recapitulation ::
+ here you make something like a concrete proposal.
- computability ::
all things that we can give algorithmic descriptions of are computable, from finite sets to infinte sets. clearly, saying that FLN is computable is not saying very much.
- definition by induction ::
you have nowhere defined what you mean this to be. do you mean least fixed point computations? this has nothing to do with `strong generation'. Strong generation is not a well-defined term, and has nothing to do with generating strings vs trees; tree grammars weakly generate sets of trees, and trees can be encoded as strings.
- mathematical induction ::
I fail to see what role mathematical induction should play here. I understand the principle of mathematical induction as allowing you to conclude from the facts that zero is in a set, and that that set is closed under successor, that that set contains all natural numbers.
- caps and gaps ::
+ Here you must be using `recursiveness' in a sense different from meaning `(partial) recursive function'. You are also committing the fallacy of assuming that which you would like to show; `second any limitations on depths of embedding in structures that FLN does generate can only be arbitrary' because you have assumed that FLN only generates infinite sets. Actually, you haven't explicitly said this, but you appeal to it anyways. I personally see no reason to make this claim; no grammar formalism that I am aware of only allows you to define infinite sets.
+ You say one relevant thing in paragraph four, by way of arguing against Everett. This is that even if he is correct in his description of the behaviour(al dispositions) of the Piraha, it would be simpler to describe their behaviour as being underlain by an infinite set. This is an empirical claim, but it is the right kind of argument to make.
+ You assert, `it follows from mathematical law that recursion is unlearnable', yet you cite nothing. In diverse learning paradigms (Gold, PAC, MDL,...) infinite classes of infinite languages are learnable.
ReplyDelete
Replies
Greg KobeleJanuary 15, 2014 at 2:45 PM
- evolution ::
+ paragraph four. Okay, some claims.
- We need to show the `spontaneous' display of:
1. Computability
`proof of a procedure [other than] a look up table'
2. Definition by induction.
`outputs must be carried forward and returned as inputs'; `outputs [should be] represented hierarchically'
3. Mathematical induction.
`generalization beyond the exposure material'
- I do not know what `spontaneous' is supposed to mean here; what counts, and why should this be accepted?
- Note that none of these three terms (in these points) is being used in the way you (tried to) define them earlier.
- points 1 and 3, viewed together as `the animal should display behavioural creativity' (on analogy with linguistic creativity in its standard chomskyian usage) seem relatively uncontroversial
- point 2 has two parts.
1. outputs must be carried forward and returned as inputs
I can see no motivation for this. This is not a description at Marr's computational level, but rather at the algorithmic one. As others have pointed out, this is either trivial (every computation must do this) or unnecessary (an algorithm using recursion can be rewritten into one without).
2. outpus should be represented hierarchically
a string is a unary branching tree. if you want some particular kind of hierarchical structure to be a /big deal/ then you should have written a paper on that. Note that many vision researchers treat the process of image recognition as involving hierarchical structure. I think that it might not be easy to have the one but not the other.

furthermore, your example of path integration is fundamentally flawed. Let us, instead of representing the outputs as a single vector, represent them as a term over vector space operations (an element of the free vector space). This is a hierarchically structured object (it is a tree). There is also a canonical homomorphism from it to the desired vector. In my view, this is /exactly/ what direct compositionality is about. I imagine that you wouldn't want to claim that FLN is not directly compositional... Given that we have direct compositional approaches to minimalism (shameless plug), this is even more unpalatable.
- universality ::
+ because it is not clear what you mean by `recursion', I cannot be sure I am responding to what you are trying to get at. But if you mean `(partial) recursive function', then you are wrong; it is not a discovery that language is describable as such a thing -- if it is describable at all, then as this. If you only mean `total recursive function', then there is some content there (not much). But then I would invite you to look into the wealth of work on mathematical linguistics for much more developed, sophisticated, and restrictive proposals.
- super-sentential syntax ::
+ `unless we have truly compelling evidence that it is not $f_{MERGE}$, we should assume on general grounds of parsimony that it is $f_{MERGE}$.' This does not come for free. You have to argue for this.
=== END COMMENTS ===
ReplyDelete
Replies
Alex ClarkJanuary 16, 2014 at 2:26 AM
Do you think for example that nondeterministic Turing machines, Post systems and deterministic Turing machines are all intensionally equivalent?

If so then I think I don't understand what "intensional" means in this context; a definition would be helpful. How does intensional equivalence differ from extensional equivalence?
ReplyDelete
Replies
UnknownJanuary 16, 2014 at 5:47 AM
@Greg: Thanks for the detailed comments, I might make another attempt at working through the paper with your evaluation next to it.

@everybody who has ever been part of the recursion paper war: May I ask why this debate has to be so complicated? The only reason there is a discussion at all is that people like Everett object to the notion that recursion is an integral part of language. So all you have to do is look at the available notions of recursion and show that they hold of language:

1) computable by some recursive function: true given standard assumptions about the brain being at most a Turing Machine
2) human languages (when viewed as sets of strings) are recursive languages: true, because no known construction pushes language beyond PMCFLs
3) recursion as self-embedding: true in various languages even when analyzed in a non-generativist framework; the fact that not all languages may allow for self-embedding is about as interesting as the fact that context-free grammars, which can handle center embedding, can generate regular languages. Just because FLN has a certain amount of power does not mean that this power is instantiated in every language.

As far as I can tell, that pretty much covers all positions that one could try to falsify empirically, and they all hold.

Now it is true that if one wants to make precise the original claim that Merge is an essential property of language, one enters more slippery terrain because that requires a precise definition of Merge, some assumptions about what the explanandum is, and a proof that you couldn't have carved things up differently. That's pretty much what Greg points out in his last comment. But that's the second level in the debate (World 2: Merge), and I don't understand why after so many years we are still stuck in the first one (World 1: Recursion).
ReplyDelete
Replies
Alex DrummondJanuary 16, 2014 at 7:41 AM
This comment has been removed by the author.
ReplyDelete
Replies
NorbertJanuary 19, 2014 at 11:04 AM
I agree with Alex D in taking a very realist view of what linguistic theories are theories of: a purported natural object in the mind/brain whose properties we are trying to determine. The way we do this is by presenting data of various sorts (acceptability, ambiguity, pars ability etc.). The 'etc' is important for being a realist, the data is open ended potentially as a mechanism is not identical to the data that the mechanism is involved in generating. This is a standard Rationalist conception of theory (see the posts on Cartwright who I think develops the themes in interesting ways) and it has always contrasted with empiricist (tending to instrumentalist) conceptions of theory, which amounts to, more or less, compact ways of representing the data.
ReplyDelete
Replies
MS DaveApril 1, 2020 at 7:57 AM
Pretty good post. I just stumbled upon your blog and wanted to say that I have really enjoyed reading your posts. Any way I'll be subscribing to your feed and I hope you post again soon..
connect brother printer to wireless network
ReplyDelete
Replies

Add comment

Faculty of Language

Comments

Wednesday, January 15, 2014

Jeff W comments on comments on recursion

89 comments:

Contributors