Sometimes you read something that, if correct, will lead to
deep disruption of the accepted wisdom. In my intellectual life, I’ve read a
few such pieces (Chomsky's review of Skinner, Syntactic Structures, Chapter 1 of Aspects and "On Wh-Movement" (among many others), Chapter 1 of
Kayne's antisymmetry book, Chapter 1 of Rizzi's book on minimality, and Fodor on
modularity and the language of thought hypothesis spring immediately to mind).
Recently, I read another. The paper is by Randy Gallistel (here)
and it does three important things: (i) It outlines the relevance of the
Language of Thought Hypothesis (LOT) for the computational theory of the brain,
(ii) It discusses the main problems with the received neuro-wisdom concerning
brain computation and shows how it is incompatible with what we know about
mental computations, and (iii) It provides evidence for the hypothesis (what I
have elsewhere referred to as the Gallistel-King Conjecture (here,
here)) that the neuron, rather than the neuronal ensemble (i.e. net), is
the primary unit of brain computation. This last is what is truly revolutionary
if correct, for it states that neuro-science has been looking in the wrong
place for the computational architecture of the brain. And that’s a big deal. A very big deal. In fact, in neuro-science
terms it is as big a deal as one can imagine. Let me say a little about each of
these points. But before doing so, let me strongly encourage you to download
and read Randy’s paper. It is exceedingly readable and very didactic. Let me
also thank Randy for allowing me to link to it here. IMO, this is something
that all linguists of the generative
variety (i.e. those right thinking people that have a cognitive take on the linguistic
enterprise) should read, understand and absorb.
The paper is intended as an encomium to Jerry Fodor and the
LOT. Randy starts by observing the very tight connection between LOT and the
current standard understanding of neural architecture where the unit of
analysis is the net and the main operation is in/de-creasing connection
strengths between nodes in the net. This is what the standard neuro wisdom
takes thinking to effectively consist in: adjusting connection strengths among
neurons in response to experience. Randy
notes that this picture is incompatible with the LOT. In other words, if the
LOT is right, then the current neural fascination with nets and connections is
just wrong in that it is looking for the physical bases of cognition in the
wrong place. Why? Because if LOT is correct then cognition is symbolic and
thinking consists in operations over these symbols and neural nets suck at
symbolic computation.
This should not be news to those that have followed Fodor
and Pylyshyn’s and Gallistel and King’s and Marcus’s critiques of connectionism
over the last 30 years. What Randy adds to this critique is the
observation that connectionists have completely failed to address any of them.
They have failed to show how such systems could symbolically compute anything
at all. As Randy puts it (p. 2), there is no model of how such a brain could
even add two numbers together:
"There is, however, a problem with this hypothesis: synaptic
conductances are ill suited to function as symbols (Gallistel and King 2010). Anyone who doubts this
should ask the first neuroscientist they can corner to explain to them how the
brain could write a number into a synapse, or into a set of synapses. Then,
step back and watch the hands wave. In the unlikely event of an intelligible
answer, ask next how the brain operates on the symbols written into synapses.
How, for example, does it add the number encoded in one synapse (or set of
synapses) to the number encoded in a different synapse (or set…) to generate
yet another synapse (or set…) that encodes the sum?"
The reason is that such “brains” cannot actually manipulate
symbols, unlike, say, a classic machine with a Turing architecture (i.e. one
with read/write memory and indirect addressing to name two important features).
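To make the contrast vivid, here is a minimal sketch (mine, not Randy's; the toy memory and the names in it are invented purely for exposition) of how trivially a read/write memory with indirect addressing does what Randy challenges the synaptic story to do: fetch two stored numbers, one of them through a pointer, add them, and write the sum back into memory.

```python
# A toy read/write memory with indirect addressing (illustrative only;
# nothing here is from Gallistel's paper, just the textbook notion of a
# Turing/von Neumann-style store).

memory = {}          # addressable read/write memory: address -> value

def write(addr, value):
    memory[addr] = value

def read(addr):
    return memory[addr]

# Store two numbers at known addresses.
write("loc_a", 7)
write("loc_b", 35)

# Indirect addressing: a "pointer" cell holds the *address* of another cell,
# which is what lets a symbol stand for a variable (variable binding).
write("operand_ptr", "loc_b")

def add_indirect(addr_x, ptr):
    x = read(addr_x)           # fetch the first operand directly
    y = read(read(ptr))        # follow the pointer, then fetch the second operand
    write("loc_sum", x + y)    # write the sum back into memory
    return "loc_sum"

print(read(add_indirect("loc_a", "operand_ptr")))   # -> 42
```

The sketch matters only as a foil: in a classical architecture these operations are primitive and cheap, whereas, as the quoted passage notes, no one has shown how a pattern of synaptic conductances could implement even this much.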
Why don’t neuroscientists find this disturbing? Randy notes
one important reason: they are all card carrying Empiricists with a strong
commitment to associationism. And given this psychology, there is nothing wrong
with neural nets (e.g. even the “fires together, wires together” slogan has the
pungency of empiricism around it). Of course, if LOT is right then associationism
is dead wrong (as Fodor has repeatedly emphasized to no apparent avail), and so
if Fodor is right then the neurosciences are destined to get nowhere as they
simply misunderstand what the brain needs to do in order to mentate. It needs
to symbolically compute and any brain architecture that cannot do that (and do
that fairly easily) is the wrong brain architecture.[1]
I, of course, find this completely compelling. However, even
if you don’t, the paper is still worth reading for the relation it establishes between
Empiricism/associationism and connectionist-synaptic theories of the brain.
This link explains a lot about the internal logic of each (and hence their
staying power). Each position reinforces the other, which is why connectionists
tend to be associationists and vice versa. Again as Randy puts it (p. 3):
"…the synaptic theory of memory
rested on a circularly reinforcing set of false beliefs: The neuroscientists’
belief in the synaptic theory of memory was sustained in no small measure by
the fact that it accorded with the psychologists’ associative theory of
learning. The psychologists’ belief in the associative theory of learning was
sustained in no small measure by its accord with what neuroscientists took to
be the material realization of memory."
Of course, the tight link between the two also shows how to
bring the whole thing (hopefully, crashing) down: debunk
Empiricism/associationism, then bye bye connectionism. That in fact is what
Randy concluded a while ago: Empiricism is clearly false so that any brain
architecture that relies on its being true is also false. Good argument. Sane
conclusion. Or as Jerry Fodor might have put this: neuroscience’s modus ponens
is Randy’s modus tollens. Nothing so gratifying as watching someone being hoist
on their own petard!
Importantly, Randy has over the years presented many other
arguments against this conception of neural computation. He has argued that the
basics of the theory never delivered
what was promised. I discussed a recent
review paper by Gallistel and Matzel (here)
that demonstrates this in detail (and that I urge you all to read). So, not
only is the learning theory widely assumed (at least tacitly) by the
neuro-science community wrong from a cognitive point of view, but the theory’s
purported mechanisms get you very little empirically even on its own terms.[2] Given this, why hasn't the whole thing
collapsed already? Why are we still talking as if there is something to learn
about the computational powers of the brain by treating it as a connectionist
network of some sort?
The main reason anything hangs around even if patently inadequate
is that there is nothing else on offer. As the saying goes, you cannot beat
something with nothing, no matter how little that something contains. So, we had
nets because we had no real alternative to neural/synaptic computations (and,
of course, because Empiricism is rampant everywhere in the cog-neuro world. It
really seems to be innate, as Lila once quipped). Until now. Or this is what
Randy argues.
Randy reviews various kinds of evidence that the brain unit
of computation is the single neuron. In other words, thinking goes on inside single neurons not (or not only)
among neurons.[3] Why there?
Randy provides various reasons for thinking that this is the
right place to look. First, he provides some conceptual reasons (truth be told,
these convinced me a while ago, the more recent empirical evidence being tasty icing
on the cake). Randy points out that all the hardware required for symbolic computation
lives inside each cell. DNA and RNA are mini digital computers. They can write
information to memory, read it from memory, allow for indirect addressing (and
hence variable binding), etc. Thus, each
cell carries with it lots of computing power of the sort required for symbolic
computation, and it uses it to carry inherited information forward in time and
computationally exploits this information in development in building an
organism.
Indeed, we’ve known this for a very long time. It goes back
at least to Watson and Crick’s discovery that genes code for information.
Recently, we’ve further discovered that some of the genetically coded
information is very abstract. Genes and proteins can code for “eyes” and body
positions (e.g. anterior, dorsal, distal)
independently of the detailed programs that build the very specific actual
structures (e.g. human vs insect eye). As Randy puts it (p. 3):
"The old saw that genes cannot
represent complex organic structures and abstract properties of a structure is
simply false; they can and they do."
So, given that we’ve got all this computing power sitting
there in our neuronal cells a natural question arises: how reasonable is it
that the greatest of all innovators, Mother Nature, never thought to use this available
computational system for cognitive ends? To Randy’s (and my) mind, not very.
After all, if a nose can become a "hand" or a wrist bone a "thumb," why couldn't
a digital computational system in place for carrying inherited information
across generational time be repurposed to carry acquired cognitive information
within an organism's lifetime? Information is information. Carry one kind and
you can carry another. Indeed, given the availability of such a useful digital
coding architecture in every living creature, the possibility that evolution never "thought" to repurpose it for
cognitive ends seems downright un-Darwinian (and no card carrying rational scientist
would want to be that nowadays!). So, it seems that only a believer in a
malevolent, slovenly deity should reject the likelihood that (at least part
of) thinking supervenes on DNA/RNA/protein computational machinery. And this
machinery resides inside each neuron,
hence thinking is (at least in part) intra (not inter)-neuronal (i.e. the unit
of computation is the neuron). Or as Randy puts this (p. 6):
"…processes operating within the
cells at the level of individual molecules implement the basic building blocks
of computation, and they do so in close connection with the reading of stored
information. The information in question is hereditary information, not
experientially acquired information. Nonetheless, it is tempting to think that
evolution long ago found a way to use this machinery, or closely related machinery, or, at least, functionally similar
molecular machinery, to do the same with experientially acquired
information." (emphasis is Randy’s)
Yes indeed. Very very tempting. Some might say “irresistible.”
Randy notes other virtues of intra-neuronal molecular
computation. He touts the efficiency of such a chemically based neuronal computation
(p. 6). It can pack lots of info into a very small, dense package, and this allows
both for computational speed and energy efficiency. Apparently, this is the
trick that lies behind the huge increase in computing power we've seen over
the last decades (putting more and more circuitry and memory into smaller and
smaller packages) and taking biological computation to be grounded in large
molecules within the neuron yields similar benefits as a matter of course.
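For a rough sense of the density involved, here is a back-of-the-envelope calculation (my own illustrative arithmetic, not figures from Randy's paper): each nucleotide comes from a four-letter alphabet and so can carry two bits, which means a few dozen nucleotides could in principle hold a machine-word-sized number.

```python
import math

# Back-of-the-envelope: information capacity of a polynucleotide.
# (Illustrative arithmetic only; not a claim about how neurons actually code.)
bits_per_nucleotide = math.log2(4)   # four possible bases -> 2 bits per position
word_size_bits = 64                  # a typical machine word

nucleotides_for_a_word = word_size_bits / bits_per_nucleotide
print(bits_per_nucleotide)        # 2.0
print(nucleotides_for_a_word)     # 32.0 -- a 32-base stretch could encode a 64-bit number
```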
So, a mechanism exists to support a cognitively friendly
conception of computation and that mechanism is able to stably store a vast
amount of info cheaply and is able to computationally use it quickly and at low
(energy) cost. Who could want more? So conceptually speaking, wouldn’t it have been
very dumb (if not downright criminal) if Mother Nature (coy tinkerer that she
is) had not exploited this readily available computational machinery for
cognitive ends? Guess what I think?
So much for the “conceptual” arguments. I find these
completely compelling, but Randy rightly notes that these have heretofore
convinced nobody (but me, it seems, a sad commentary on the state of
neuro-science IMO). However, recently, experimental evidence has arisen to
support this wonderful idea. I’ve mentioned before some tantalizing bits of
evidence (here
and here)
in favor of this view. Randy reviews some new experimental stuff that quite
directly supports his basic idea. He discusses three recent experiments, the
most impressive (to me) being work in Sweden (a correction: I originally said Norway; my mistake and apologies to Nordics everywhere) that shows that the acquired
eyeblink response in ferrets "resides within individual Purkinje cells in the
cerebellar cortex" (7). Randy describes their paper (not an easy read for a
tyro like me) in detail, as well as two other relevant experiments. It’s a bit
of a slog, but given how important the results are, it is worth the effort.
In the popular imagination science is thought of as the
careful accumulation of information leading to the development of ever more
refined theories. This is not entirely
false, but IMO, it is very misleading. Data is useful in testing and
elaborating ideas. But, we need the ideas. And, surprisingly, sometimes some
hard thinking about a problem can lead to striking novel ideas, and even the
dislodging of well-entrenched conceptions.
Chomsky did this in his review of Skinner and subsequent development of
generative grammar. Randy’s suggestion that thinking is effectively
DNA/RNA/Protein manipulation that takes place within the single cell does this
for neuro-science building on the Fodor/Chomsky conception of the mind as a
computational symbolic processing engine.
Such a conception demands a Turing-like brain architecture. And, it
seems, nature has provided one at the molecular level. DNA/RNA/proteins
can do this sort of stuff. Amazingly, we are starting to find evidence that
what should be the case indeed actually is the case. This is scientific
thinking at its most exciting. Read the paper and enjoy.
[1]
This point bears repeating: what Fodor&Pylyshyn and Marcus in effect argued
is that connectionist models of the mind were deeply inadequate. They did not argue against such models as
brain models (i.e. on the implementational level). Randy extends this criticism
to the implementational level: not only is connectionism a bad model of the mind,
it is also a bad model of the brain. Interestingly, it is a bad model of the
brain precisely because it cannot realize the architecture needed to make it a
good model of the mind.
[2]
If correct, this is an especially powerful argument. It is one thing not to get
what others think important, quite another to not even gain much of what you think is so. This is what marks
great critiques like Chomsky’s of Skinner, Fodor&Pylyshyn’s and Marcus’ of
connectionist mental architectures and Gallistel’s of connectionist models of
the brain.
[3]
Randy’s criticisms of neural nets as units of computation invite the strong
reading that computation only goes on
intra-neuronally. But this need not be correct for Randy’s main point to be
valid, viz. that lots of computing
relevant to cognition goes on inside the neuron, even if some may take place inter-neuronally.
One interesting trend among the folks you might call "3rd generation connectionists” (i.e., today’s deep learning coterie) is how much they take on board the computational-impossibility objections that Gallistel (and Fodor and Pylyshyn and ...) rightly raise against computation-as-association. If you want to compute interesting things and be able to learn to make the right kind of generalizations, the traditional connectionist model is simply not appropriate. Thus, you have papers like this overhyped one, which I suspect wouldn’t be symbolic enough to pacify true Fodorians, but which still represents a very different computational beast than what Gallistel et al. are criticizing. Admittedly, I don't know if many of these guys are thinking seriously about cog sci—they're too busy spending the big bags of money Google/Facebook/Microsoft gives them—but the (re)volution is worth noting. And it is an admission on the part of the connectionists (even if it’s just the “applied” ones) that there is more to life than synaptic weights.
This brings me to my second point, which makes me question whether this is truly the connectionist death-knell. Whereas Gallistel rightly objects that old-school connectionists "hand-wave" about associationist (non-)solutions to hard information processing problems, it seems to me that new-school symbolic types are equally guilty of hand-waving when it comes to learning (if you think I'm mistaken here, I would welcome any citations). Learning by risk minimization (i.e., what back-propagation does, and other algorithms do too) is at least as old as—and probably way older than—the symbol processing system found in RNA/DNA. So to my mind, we are at an impasse: we either have an impoverished computational system, or an impoverished learning system. Perhaps due to my formative years at Maryland, I’m actually pretty happy to work with impoverished computational systems (it’s just another reduction of the hypothesis space!), but I’m loath to give up on formal models of learning. Taste and intuition may guide others in other directions, but eh, it’s a big world.
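(For concreteness, the following is a minimal sketch of what I mean by learning as risk minimization: a one-parameter linear predictor fit by gradient descent on squared error, with made-up data and a made-up learning rate. Back-propagation is this same idea pushed through the layers of a network.)

```python
# Toy example of learning by empirical risk minimization: fit y ~ w*x by
# gradient descent on mean squared error. (Illustrative only; the data and
# learning rate are invented for the example.)

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]   # (x, y) pairs

w = 0.0                      # single parameter to learn
lr = 0.01                    # learning rate

for step in range(1000):
    # gradient of (1/N) * sum (w*x - y)^2 with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad           # descend the empirical risk

print(round(w, 3))           # close to 2.0, the slope that minimizes the risk
```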
@Chris:
Who is asking you to give up on "learning"? Randy is just asking that you see what things are being learned and understand what this presupposes. If we learn IS intervals (numbers representing the time between CS onset and US) then we need a mechanism to do this. We need ways of coding numbers in memory that can then be retrieved and further computed with. Does this mean that we don't also need a story about why it takes lots of exposures to the CS/US pair to do this? No. But it does mean that the knowledge attained is symbolic, and this has consequences for both mind and brain architecture.
The trick, as Gallistel notes in his book with King, is to combine symbolic systems with probabilities. He is a Bayesian, as he thinks that this is where the two can meet. I am not expert enough to have an opinion. But from where I sit, IF he is right, that's fine with me. So, you won't find many symbolic types denying that probabilities matter, whereas you do find many connectionists denying that mentation invokes symbolic computation. Or so it seems to me.
I haven't talked to too many connectionists directly, but there are definitely proponents of embodied cognition who deny computation and representation:
http://www.talkingbrains.org/2014/10/embodied-or-symbolic-who-cares.html
I have a feeling that the connectionists and the embodied folks hang out at conferences and have drinks together. To be honest, though, it seems like everyone in cog neuro is a connectionist even if they believe in representation and computation. Which means they need to read G&K's 2009 book and this article and understand the arguments presented there.
@William, I'm confused by the claim "everyone in cog neuro [is] a connectionist even if they believe in representation and computation." If you have a "connectionist" model that has computation, representation, memory, and indirection, in what sense do G&K's arguments against connectionism hold? I'll grant you that the stuff I'm referring to is new, fairly marginal stuff (although I see antecedents of it going back 10 or 15 years, maybe longer), and I'm reading just in the ML literature (not the psych/neuro/cogsci lit), but it seems like many of the requirements for G&K's computational primitives of mentation do exist in these architectures. (I will also confess I haven't read G&K since it was a very early pre-pub draft, but I've ordered the book, so I'll catch up soon.)
The comment wasn't intended as a claim; it was an observation - all of the people I know in cognitive neuroscience also happen to like connectionist modeling, in the common parlance in which 'connectionism' is used to refer to activation values and weights as the relevant parameters.
There's a line of logic that I've been thinking about more and more as I have read this blog that I'd like to test out. Please point out any flaws.
1) Optimality theory is based on a connectionist theory of cognition.
2) Connectionist theories of cognition are based on synaptic theories of neuroscience.
3) Synaptic theories of neuroscience seem to be wrong.
4) Therefore, Connectionist theories are baseless.
5) Therefore, OT is baseless.
Obviously this is far too brief to be convincing to most OT practitioners, and as a syntax-semantics guy I don't really spend that much time in the OT literature, so I am probably oversimplifying ...
Not sure about premiss (1). Does "based on" mean "require" or "is consistent with"? If the latter, then the argument seems overdrawn. I am not an OTer, but from the little I know, I don't see that the strong assumption is required.
My sense is that it might be somewhere between a requirement and consistency with. Prince and Smolensky definitely tout the consistency with connectionism as a selling feature for OT. It's probably OT's main theoretical "advantage" over rule-based approaches.
DeleteSo premiss (1) is a hypothesis of sorts
Well, if Gallistel is right then it's not a great advantage, and maybe even worse.