Sometimes you read something that, if correct, will lead to deep disruption of the accepted wisdom. In my intellectual life, I’ve read a few such pieces: Chomsky’s review of Skinner, Syntactic Structures, Chapter 1 of Aspects and “On Wh-movement” (among many others), Chapter 1 of Kayne’s antisymmetry book, Chapter 1 of Rizzi’s book on minimality, and Fodor on modularity and the language of thought hypothesis spring immediately to mind. Recently, I read another. The paper is by Randy Gallistel (here) and it does three important things: (i) it outlines the relevance of the Language of Thought Hypothesis (LOT) for the computational theory of the brain, (ii) it discusses the main problems with the received neuro-wisdom concerning brain computation and shows how that wisdom is incompatible with what we know about mental computations, and (iii) it provides evidence for the hypothesis (what I have elsewhere referred to as the Gallistel-King Conjecture (here, here)) that the neuron, rather than the neuronal ensemble (i.e. net), is the primary unit of brain computation. This last is what is truly revolutionary if correct, for it states that neuro-science has been looking in the wrong place for the computational architecture of the brain. And that’s a big deal. A very big deal. In fact, in neuro-science terms it is as big a deal as one can imagine. Let me say a little about each of these points. But before doing so, let me strongly encourage you to download and read Randy’s paper. It is exceedingly readable and very didactic. Let me also thank Randy for allowing me to link to it here. IMO, this is something that all linguists of the generative variety (i.e. those right-thinking people who have a cognitive take on the linguistic enterprise) should read, understand and absorb.
The paper is intended as an encomium to Jerry Fodor and the LOT. Randy starts by observing the very tight connection between LOT and the current standard understanding of neural architecture, where the unit of analysis is the net and the main operation is increasing or decreasing connection strengths between nodes in the net. This is what the standard neuro wisdom takes thinking to effectively consist in: reordering connection strengths among neurons in response to experience. Randy notes that this picture is incompatible with the LOT. In other words, if the LOT is right, then the current neural fascination with nets and connections is just wrong, in that it is looking for the physical bases of cognition in the wrong place. Why? Because if LOT is correct then cognition is symbolic, thinking consists in operations over these symbols, and neural nets suck at symbolic computation.
This should not be news to those who have followed Fodor and Pylyshyn’s, Gallistel and King’s, and Marcus’s critiques of connectionism over the last 30 years. What Randy adds here is the observation that connectionists have completely failed to address any of these critiques. They have failed to show how such systems could symbolically compute anything at all. As Randy puts it (p. 2), there is no model of how such a brain could even add two numbers together:
"There is, however, a problem with this hypothesis: synaptic conductances are ill suited to function as symbols (Gallistel and King 2010). Anyone who doubts this should ask the first neuroscientist they can corner to explain to them how the brain could write a number into a synapse, or into a set of synapses. Then, step back and watch the hands wave. In the unlikely event of an intelligible answer, ask next how the brain operates on the symbols written into synapses. How, for example, does it add the number encoded in one synapse (or set of synapses) to the number encoded in a different synapse (or set…) to generate yet another synapse (or set…) that encodes the sum?"
The reason is that such “brains” cannot actually manipulate symbols, unlike, say, a classic machine with a Turing architecture (i.e. one with read/write memory and indirect addressing to name two important features).
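To make "read/write memory and indirect addressing" concrete, here is a minimal toy sketch. It is not from Randy's paper, and every name in it (`Memory`, `read_indirect`, `add_into`) is hypothetical, chosen only for illustration. It shows the three things the quoted challenge asks for: writing a number to a location, reading it back, and adding two stored numbers to produce a third stored number.

```python
# Illustrative sketch only: a toy read/write memory with indirect addressing.
# All names here (Memory, read_indirect, add_into) are hypothetical.

class Memory:
    """A flat array of addressable cells that hold symbols (here, integers)."""

    def __init__(self, size):
        self.cells = [0] * size

    def write(self, address, value):
        # Store a symbol at a known location so it can be retrieved later.
        self.cells[address] = value

    def read(self, address):
        # Retrieve the symbol stored at a location.
        return self.cells[address]

    def read_indirect(self, pointer_address):
        # Indirect addressing: the cell at pointer_address holds the
        # address of the cell whose contents we actually want.
        return self.read(self.read(pointer_address))


def add_into(memory, addr_a, addr_b, addr_sum):
    """Read two stored numbers, add them, and write the sum to a third cell."""
    memory.write(addr_sum, memory.read(addr_a) + memory.read(addr_b))


mem = Memory(8)
mem.write(0, 3)            # write the number 3 into cell 0
mem.write(1, 4)            # write the number 4 into cell 1
add_into(mem, 0, 1, 2)     # cell 2 now holds their sum
mem.write(3, 2)            # cell 3 acts as a pointer to cell 2
result = mem.read_indirect(3)
print(result)
```

The point of the sketch is only that each step is trivial to state for an architecture with addressable memory; the challenge in the quote is to state the analogous steps for numbers "written into" synaptic conductances.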
Why don’t neuroscientists find this disturbing? Randy notes one important reason: they are all card-carrying Empiricists with a strong commitment to associationism. And given this psychology, there is nothing wrong with neural nets (e.g. even the “fires together, wires together” slogan has the pungency of empiricism around it). Of course, if LOT is right then associationism is dead wrong (as Fodor has repeatedly emphasized to no apparent avail), and so if Fodor is right then the neurosciences are destined to get nowhere, as they simply misunderstand what the brain needs to do in order to mentate. It needs to symbolically compute, and any brain architecture that cannot do that (and do that fairly easily) is the wrong brain architecture.
I, of course, find this completely compelling. However, even if you don’t, the paper is still worth reading for the relation it establishes between Empiricism/associationism and connectionist-synaptic theories of the brain. This link explains a lot about the internal logic of each (and hence their staying power). Each position reinforces the other, which is why connectionists tend to be associationists and vice versa. Again as Randy puts it (p. 3):
"…the synaptic theory of memory rested on a circularly reinforcing set of false beliefs: The neuroscientists’ belief in the synaptic theory of memory was sustained in no small measure by the fact that it accorded with the psychologists’ associative theory of learning. The psychologists’ belief in the associative theory of learning was sustained in no small measure by its accord with what neuroscientists took to be the material realization of memory."
Of course, the tight link between the two also shows how to bring the whole thing (hopefully, crashing) down: debunk Empiricism/associationism, then bye bye connectionism. That in fact is what Randy concluded a while ago: Empiricism is clearly false, so any brain architecture that relies on its being true is also false. Good argument. Sane conclusion. Or as Jerry Fodor might have put this: neuroscience’s modus ponens is Randy’s modus tollens. Nothing so gratifying as watching someone being hoist by their own petard!
Importantly, Randy has over the years presented many other arguments against this conception of neural computation. He has argued that the basics of the theory never delivered what was promised. I discussed a recent review paper by Gallistel and Matzel (here) that demonstrates this in detail (and that I urge you all to read). So, not only is the learning theory widely assumed (at least tacitly) by the neuro-science community wrong from a cognitive point of view, but the theory’s purported mechanisms get you very little empirically even in its own terms. Given this, why hasn’t the whole thing collapsed already? Why are we still talking as if there is something to learn about the computational powers of the brain by treating it as a connectionist network of some sort?
The main reason anything hangs around even if patently inadequate is that there is nothing else on offer. As the saying goes, you cannot beat something with nothing, no matter how little that something contains. So, we had nets because we had no real alternative to neural/synaptic computations (and, of course, because Empiricism is rampant everywhere in the cog-neuro world. It really seems to be innate, as Lila once quipped). Until now. Or this is what Randy argues.
Randy reviews various kinds of evidence that the brain unit of computation is the single neuron. In other words, thinking goes on inside single neurons not (or not only) among neurons. Why there?
Randy provides various reasons for thinking that this is the right place to look. First, he provides some conceptual reasons (truth be told, these convinced me a while ago, the more recent empirical evidence being tasty icing on the cake). Randy points out that all the hardware required for symbolic computation lives inside each cell. DNA and RNA are mini digital computers. They can write information to memory, read it from memory, allow for indirect addressing (and hence variable binding), etc. Thus, each cell carries with it lots of computing power of the sort required for symbolic computation, and it uses this power to carry inherited information forward in time and to computationally exploit this information in development when building an organism.
Indeed, we’ve known this for a very long time. It goes back at least to Watson and Crick’s discovery that genes code for information. Recently, we’ve further discovered that some of the genetically coded information is very abstract. Genes and proteins can code for “eyes” and body positions (e.g. anterior, dorsal, distal) independently of the detailed programs that build the very specific actual structures (e.g. human vs insect eye). As Randy puts it (p. 3):
"The old saw that genes cannot represent complex organic structures and abstract properties of a structure is simply false; they can and they do."
So, given that we’ve got all this computing power sitting there in our neuronal cells, a natural question arises: how reasonable is it that the greatest of all innovators, Mother Nature, never thought to use this available computational system for cognitive ends? To Randy’s (and my) mind, not very. After all, if a nose can become a “hand” or a wrist bone a “thumb”, why couldn't a digital computational system in place for carrying inherited information across generational time be repurposed to carry acquired cognitive information within an organism’s lifetime? Information is information. Carry one kind and you can carry another. Indeed, given the availability of such a useful digital coding architecture in every living creature, the possibility that evolution never “thought” to repurpose it for cognitive ends seems downright un-Darwinian (and no card-carrying rational scientist would want to be that nowadays!). So, it seems that only a believer in a malevolent, slovenly deity should reject the likelihood that (at least part of) thinking supervenes on DNA/RNA/protein computational machinery. And this machinery resides inside each neuron, hence thinking is (at least in part) intra (not inter)-neuronal (i.e. the unit of computation is the neuron). Or as Randy puts this (p. 6):
"…processes operating within the cells at the level of individual molecules implement the basic building blocks of computation, and they do so in close connection with the reading of stored information. The information in question is hereditary information, not experientially acquired information. Nonetheless, it is tempting to think that evolution long ago found a way to use this machinery, or closely related machinery, or, at least, functionally similar molecular machinery, to do the same with experientially acquired information." (emphasis is Randy’s)
Yes indeed. Very very tempting. Some might say “irresistible.”
Randy notes other virtues of intra-neuronal molecular computation. He touts the efficiency of such chemically based neuronal computation (p. 6). It can pack lots of info into a very small, dense package, and this allows for both computational speed and energy efficiency. Apparently, this is the trick that lies behind the huge increase in computing power we’ve seen over the last decades (putting more and more circuitry and memory into smaller and smaller packages), and taking biological computation to be grounded in large molecules within the neuron yields similar benefits as a matter of course.
So, a mechanism exists to support a cognitively friendly conception of computation and that mechanism is able to stably store a vast amount of info cheaply and is able to computationally use it quickly and at low (energy) cost. Who could want more? So conceptually speaking, wouldn’t it have been very dumb (if not downright criminal) if Mother Nature (coy tinkerer that she is) had not exploited this readily available computational machinery for cognitive ends? Guess what I think?
So much for the “conceptual” arguments. I find these completely compelling, but Randy rightly notes that these have heretofore convinced nobody (but me, it seems, a sad commentary on the state of neuro-science IMO). However, recently, experimental evidence has arisen to support this wonderful idea. I’ve mentioned before some tantalizing bits of evidence (here and here) in favor of this view. Randy reviews some new experimental stuff that quite directly supports his basic idea. He discusses three recent experiments, the most impressive (to me) being work in Sweden (a correction: I originally wrote Norway; my mistake and apologies to Nordics everywhere) that shows that the acquired eyeblink response in ferrets “resides within individual Purkinje cells in the cerebellar cortex” (p. 7). Randy describes their paper (not an easy read for a tyro like me) in detail, as well as two other relevant experiments. It’s a bit of a slog, but given how important the results are, it is worth the effort.
In the popular imagination, science is thought of as the careful accumulation of information leading to the development of ever more refined theories. This is not entirely false, but IMO, it is very misleading. Data is useful in testing and elaborating ideas. But we need the ideas. And, surprisingly, sometimes some hard thinking about a problem can lead to strikingly novel ideas, and even to the dislodging of well-entrenched conceptions. Chomsky did this in his review of Skinner and subsequent development of generative grammar. Randy’s suggestion that thinking is effectively DNA/RNA/protein manipulation that takes place within the single cell does this for neuro-science, building on the Fodor/Chomsky conception of the mind as a computational symbolic processing engine. Such a conception demands a Turing-like brain architecture. And it seems that nature has provided such at the molecular level. DNA/RNA/proteins can do this sort of stuff. Amazingly, we are starting to find evidence that what should be the case indeed actually is the case. This is scientific thinking at its most exciting. Read the paper and enjoy.
This point bears repeating: what Fodor & Pylyshyn and Marcus argued, effectively, is that connectionist models of the mind were deeply inadequate. They did not argue against such models as brain models (i.e. on the implementation level). Randy extends this criticism to the implementational level: not only is connectionism a bad model of the mind, it is also a bad model of the brain. Interestingly, it is a bad model of the brain precisely because it cannot realize the architecture needed to make it a good model of the mind.
If correct, this is an especially powerful argument. It is one thing not to get what others think important, quite another to not even gain much of what you think is so. This is what marks great critiques like Chomsky’s of Skinner, Fodor & Pylyshyn’s and Marcus’s of connectionist mental architectures, and Gallistel’s of connectionist models of the brain.
Randy’s criticisms of neural nets as units of computation invite the strong reading that computation only goes on intra-neuronally. But this need not be correct for Randy’s main point to be valid, viz. that lots of computing relevant to cognition goes on inside the neuron, even if some may take place inter-neuronally.
One interesting trend among the folks who you might call “3rd generation connectionists” (i.e., today’s deep learning coterie) is how much they accept the computational-impossibility objections that Gallistel (and Fodor and Pylyshyn and ...) rightly raise against computation-as-association. If you want to compute interesting things and be able to learn to make the right kind of generalizations, the traditional connectionist model is simply not appropriate. Thus, you have papers like this overhyped one, which I suspect wouldn’t be symbolic enough to pacify true Fodorians, but which still represents a very different computational beast than what Gallistel et al. are criticizing. Admittedly, I don't know if many of these guys are thinking seriously about cog sci (they're too busy spending the big bags of money Google/Facebook/Microsoft gives them), but the (re)volution is worth noting. And it is an admission on the part of the connectionists (even if it’s just the “applied” ones) that there is more to life than synaptic weights.
This brings me to my second point, which makes me question whether this is truly the connectionist death-knell. Whereas Gallistel rightly objects that old-school connectionists "hand-wave" about associationist (non-)solutions to hard information processing problems, it seems to me that new-school symbolic types are equally guilty of hand-waving when it comes to learning (if you think I'm mistaken here, I would welcome any citations). Learning by risk minimization (i.e., what back-propagation does, and other algorithms do too) is at least as old as, and probably way older than, the symbol processing system found in RNA/DNA. So to my mind, we are at an impasse: we either have an impoverished computational system, or an impoverished learning system. Perhaps due to my formative years at Maryland, I’m actually pretty happy to work with impoverished computational systems (it’s just another reduction of the hypothesis space!), but I’m loath to give up on formal models of learning. Taste and intuition may guide others in other directions, but eh, it’s a big world.
Who is asking you to give up on "learning"? Randy is just asking that you see what things are being learned and understand what this presupposes. If we learn IS intervals (numbers representing the time between CS onset and US) then we need a mechanism to do this. We need ways of coding numbers in memory that can then be retrieved and further computed with. Does this mean that we don't also need a story about why it takes lots of exposures to the CS/US pair to do this? No. But it does mean that the knowledge attained is symbolic and this has consequences for both mind and brain architecture.
The trick, as Gallistel notes in his book with King, is to combine symbolic systems with probabilities. He is a Bayesian, as he thinks that this is where the two can meet. I am not expert enough to have an opinion. But from where I sit, IF he is right, that's fine with me. So, you won't find many symbolic types denying that probabilities matter, whereas you do find many connectionists denying that mentation invokes symbolic computation. Or so it seems to me.
I haven't talked to too many connectionists directly, but there are definitely proponents of embodied cognition who deny computation and representation:
I have a feeling that the connectionists and the embodied folks hang out at conferences and have drinks together. To be honest, though, it seems like everyone in cog neuro is a connectionist even if they believe in representation and computation. Which means they need to read G&K's 2009 book and this article and understand the arguments presented there.
@William, I'm confused by the claim "everyone in cog neuro [is] a connectionist even if they believe in representation and computation." If you have a "connectionist" model that has computation, representation, memory, and indirection, in what sense do G&K's arguments against connectionism hold? I'll grant you that the stuff I'm referring to is new, fairly marginal stuff (although I see antecedents of it going back 10 or 15 years, maybe longer), and I'm reading just in the ML literature (not the psych/neuro/cogsci lit), but it seems like many of the requirements for G&K's computational primitives of mentation do exist in these architectures. (I will also confess I haven't read G&K since it was a very early pre-pub draft, but I've ordered the book, so I'll catch up soon.)
The comment wasn't intended as a claim; it was an observation: all of the people I know in cognitive neuroscience also happen to like connectionist modeling, in the common parlance in which 'connectionism' refers to activation values and weights as the relevant parameters.
There's a line of logic that I've been thinking about more and more as I have read this blog that I'd like to test out. Please point out any flaws.
1) Optimality theory is based on a connectionist theory of cognition.
2) Connectionist theories of cognition are based on synaptic theories of neuroscience.
3) Synaptic theories of neuroscience seem to be wrong.
4) Therefore, Connectionist theories are baseless.
5) Therefore, OT is baseless.
Obviously this is far too brief to be convincing to most OT practitioners, and as a syntax-semantics guy I don't really spend that much time in the OT literature, so I am probably oversimplifying ...
Not sure about premiss (1). Does "based on" mean "requires" or "is consistent with"? If the latter, then the argument seems overdrawn. I am not an OTer, but from the little I know, I don't see that the strong assumption is required.
My sense is that it might be somewhere between a requirement and consistency. Prince and Smolensky definitely tout the consistency with connectionism as a selling feature for OT. It's probably OT's main theoretical "advantage" over rule-based approaches.
So premiss (1) is a hypothesis of sorts
Well, if Gallistel is right then it's not a great advantage, and maybe even worse.