I recently received two papers that explore Gallistel’s
conjecture (see here
for one discussion) concerning the locus of neuronal computation. The first (here)
is a short paper that summarizes Randy’s arguments and suggests a novel view of
synaptic plasticity. The second (here: http://www.nature.com/nature/journal/v538/n7626/full/nature20101.html)[1]
accepts Randy’s primary criticism of neural nets and couples a neural net
architecture with a pretty standard external memory system. Let me say a word
about each.
The first paper is by Patrick Trettenbrein (PT) and it
appears in Frontiers in Systems Neuroscience.
It does three things.
First, it reviews the evidence against the idea that brains
store information in their “connectivity profiles” (2). This is the classical
assumption that inter-neural connection strengths are the locus of information
storage. The neurophysiological mechanisms for this are long term potentiation
(LTP) and long term depression (LTD). LTP/D are the technical terms for
whatever strengthens or weakens interneuron connections/linkages. I’ve
discussed Gallistel and Matzel’s (G&M) critique of the LTP/D mechanisms
before (see here).
PT reviews these again and emphasizes G&M’s point that there is an intimate
connection between this Hebbian “fire together wire together” LTP/D based
conception of memory and associationist psychology. As PT puts it: “Crucially,
it is only against this background of association learning that LTP and LTD
seem to provide a neurobiologically as well as psychologically plausible
mechanism for learning and memory” (88). This is why if you reject
associationism and endorse “classical cognitive science” and its “information
processing approach to the study of the mind/brain” you will be inclined to find
contemporary connectionist conceptions of the brain wanting (3).
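To make the Hebbian picture concrete, here is a minimal sketch of the kind of correlational weight update the “fire together wire together” story assumes: units that are co-active get their connection strengthened (LTP-like), while weights otherwise slowly decay (LTD-like). The learning-rate and decay parameters are illustrative choices of mine, not anything from PT or G&M.

```python
import numpy as np

def hebbian_step(W, pre, post, lr=0.01, decay=0.001):
    """One step of a generic Hebbian rule: connections between co-active
    pre- and post-synaptic units are strengthened (LTP-like), and all
    weights passively decay (LTD-like). Parameters are illustrative only."""
    W = W + lr * np.outer(post, pre)   # "fire together, wire together"
    W = W - decay * W                  # slow depression keeps weights bounded
    return W

# Toy usage: 4 presynaptic units, 3 postsynaptic units
W = np.zeros((3, 4))
pre = np.array([1.0, 0.0, 1.0, 0.0])
post = np.array([1.0, 1.0, 0.0])
W = hebbian_step(W, pre, post)
```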
Second, there is recent evidence that connection strength
cannot be the whole story. PT reviews the main evidence. It revolves around
retaining memory traces despite very significant alterations in connectivity
profiles. So, for example, “memories appear to persist in cell bodies and can
be restored after synapses have been eliminated” (3), which would be odd if
memories lived in the synaptic connections. Similarly it has recently been shown
that “changes in synaptic strength are not directly related to storage of new
information in memory” (3). Finally, and I like this one the best (PT describes
it as “the most challenging to the idea that the synapse is the locus of memory
in the brain”), PT quotes a 2015 paper by Bizzi and Ajemian which makes the
following point:
If we believe that memories are
made of patterns of synaptic connections sculpted by experience, and if we
know, behaviorally, that motor memories last a lifetime, then how can we
explain the fact that individual synaptic spines are constantly turning over
and that aggregate synaptic strengths are constantly fluctuating?
Third, PT offers a reconceptualization of the role of these
neural connections. Here’s an extended quote (5):
…it occurs to me that we
should seriously consider the possibility that the observable changes in
synaptic weights and connectivity might not so much constitute the very basis
of learning as they are the result of learning.
This is
to say that once we accept the conjecture of Gallistel and collaborators that
the study of learning can and should be separated from the study of memory to a
certain extent, we can reinterpret synaptic plasticity as the brain's way of
ensuring a connectivity and activity pattern that is efficient and appropriate
to environmental and internal requirements within physical and developmental
constraints. Consequently, synaptic plasticity might be understood as a means
of regulating behavior (i.e., activity and connectivity patterns) only after
learning has already occurred. In other words, synaptic weights and connections
are altered after relevant information has already been extracted from the
environment and stored in memory.
This leaves a place for connectivity: not as the mechanism of memory, but as what allows memories to be efficiently exploited.[2] Memories live within the
cell but putting these to good use requires connections to other parts of the
brain where other cells store other memories. That’s the basic idea. Or as PT puts
it (6):
The
role of synaptic plasticity thus changes from providing the fundamental memory
mechanism to providing the brain’s way of ensuring that its wiring diagram
enables it to operate efficiently…
As PT notes, the Gallistel conjecture and his own tentative proposal are speculative, as theories of the relevant cell-internal mechanisms don’t currently exist. That said, the neurophysiological (and computational, see below) evidence against the classical Hebbian view is mounting, and the serious problems with storing memories in usable form in connection strengths (the basis of Gallistel’s critique) are becoming more and more widely recognized.
This brings us to the second Nature paper noted above. It endorses
the Gallistel critique of neural nets and recognizes that neural net
architectures are poor ways of encoding memories. It adds a conventional RAM to
a neural net and this combination allows the machine to “represent and
manipulate complex data structures.”
Artificial
neural networks are remarkably adept at sensory processing, sequence learning
and reinforcement learning, but are limited in their ability to represent
variables and data structures and to store data over long timescales, owing to
the lack of an external memory. Here we introduce a machine learning model
called a differentiable neural computer (DNC), which consists of a neural
network that can read from and write to an external memory matrix, analogous to
the random-access memory in a conventional computer. Like a conventional
computer, it can use its memory to represent and manipulate complex data
structures, but, like a neural network, it can learn to do so from data.
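For concreteness, here is a toy sketch of the core idea: an external memory matrix that is read and written by content similarity, with soft addressing weights. This is only a schematic stand-in of mine, not the DNC itself; the real model learns its read and write keys end to end and combines content lookup with usage-based allocation and temporal-link weightings, none of which is reproduced here.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class ToyExternalMemory:
    """Toy stand-in for the DNC's external memory matrix: N slots of width W,
    addressed softly by content (cosine) similarity. The real DNC learns its
    read/write keys end to end and adds allocation and temporal-link weightings."""

    def __init__(self, slots=16, width=8, seed=0):
        rng = np.random.default_rng(seed)
        self.M = 0.01 * rng.standard_normal((slots, width))

    def _address(self, key, sharpness=10.0):
        # Soft addressing distribution over memory rows, by cosine similarity.
        norms = np.linalg.norm(self.M, axis=1) * np.linalg.norm(key) + 1e-8
        return softmax(sharpness * (self.M @ key) / norms)

    def write(self, key, vector):
        w = self._address(key)
        # Erase-then-add, weighted by the addressing distribution.
        self.M = self.M * (1.0 - w[:, None]) + np.outer(w, vector)

    def read(self, key):
        return self._address(key) @ self.M  # weighted sum over memory rows

# Toy usage: store a vector, then retrieve it from a noisy cue.
mem = ToyExternalMemory()
item = np.random.default_rng(1).standard_normal(8)
mem.write(key=item, vector=item)
recalled = mem.read(key=item + 0.05)
```

The point, for present purposes, is just that what gets stored lives in an explicit, addressable memory matrix rather than in the connection weights of the net.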
Note that the system is still “associationist” in that
learning is largely data driven (and as such will necessarily run into PoS
problems when applied to any interesting cognitive domain like language) but it
at least recognizes that neural nets
are not good for storing information. This latter is Randy’s point. The paper
is significant for it comes from Google’s DeepMind group, and this means that
Randy’s general observations are making intellectual inroads with important
groups. Good.
However, this said, these models are not cognitively realistic, for they still don’t make room for the domain-specific knowledge that
we know characterizes (and structures)
different domains. The main problem remains the associationism that the Google
model puts at the center of the system. As we know that associationism is wrong
and that real brains characterize knowledge independently of the “input,” we
can be sure that this hybrid model will need serious revision if intended as a
good cog-neuro model.
Let me put this another way. Classical cog sci rests on the
assumption that representations are central to understanding cognition. Fodor
and Pylyshyn and Marcus long ago argued convincingly that connectionism did not successfully accommodate representations (and, recall, connectionists agreed that their theories dumped representations) and that this was a serious problem for connectionist/neural net architectures. Gallistel further argued that neural nets were poor models of the brain (i.e., not only of the mind) because they embody a wrong conception of memory, one that makes it hard to read/write/retrieve
complex information (data structures) in usable form. This, Gallistel noted,
starkly contrasts with more classical architectures. The combined Fodor-Pylyshyn-Marcus-Gallistel
critique then is that connectionist/neural net theories were a wrong turn
because they effectively eschewed representations
and that this is a problem both from the cognitive and the neuro perspective.
The Google Nature paper effectively concedes this point, recognizes that representations (i.e., “complex data structures”) are critical, and resolves the problem by adding a classical RAM to a connectionist front end.
However, there is a second feature of most connectionist approaches that is also wrong. Most such
architectures are associationist. They embody the idea that brains are entirely
structured by the properties of the inputs to the system. As PT puts it (2):
Associationism has come in
different flavors since the days of Skinner, but they all share the fundamental
aversion toward internally adding structure to contingencies in the world
(Gallistel and Matzel 2013).
Yes! Connectionists are weirdly attracted to associationism as well as to rejecting representations. This is probably not that surprising. Once one thinks of representations, it quickly becomes clear that many of their properties are not reducible to statistical properties of the inputs. Representations have formal properties above and beyond what one finds in the input, which, once you look, are found to be causally efficacious. However, strictly speaking, associationism and anti-representationalism are independent dimensions. What makes Behaviorists distinctive among Empiricists is their rejection of representations. What unifies all Empiricists is their endorsement of associationism. Seen from this perspective, Gallistel and Fodor and Pylyshyn and Marcus have been arguing that representations are critical. The Google paper agrees. This still leaves associationism, however, a position the Googlers embrace.[3]
So is this a step forward? Yes. It would be a big step
forward if the information processing/representational model of the mind/brain
became the accepted view of things, especially in the brain sciences. We could
then concentrate (yet again) all of our fire on the pernicious Empiricism that so many cog-neuro types embrace.[4]
But, little steps my friends, little steps. This is a victory of sorts. Better
to be arguing against Locke and Hume than Skinner![5]
That’s it. Take a look.
[1]
Thx to Chris Dyer for bringing the paper to my attention. I put the URL in above rather than link to the paper directly, as the linking did not seem to work.
Sorry.
[2]
Redolent of a competence/performance distinction, isn’t it? The physiological bases of memory should not
be confused with the physical bases for the deployment of memory.
[3]
I should add that it is not clear that the Googlers care much about the
cog-neuro issues. Their concerns are largely technological, it seems to me. They
live in a Big Data world, not one where PoS problems (are thought to) abound.
IMO, even in a uuuuuuge data environment, PoS issues will arise, though finding
them will take more cleverness. At any rate, my remarks apply to the Google model as if it were intended as a cog-neuro one.
[4]
And remember, as Gallistel notes (and PT emphasizes) much of the connectionism
one sees in the brain sciences rests on thinking that the physiology has a
natural associationist interpretation psychologically. So, if we knock out one
strut, the other may be easier to dislodge as well (I know that this is wishful
thinking btw).
[5]
As usual, my thinking on these issues was provoked by some comments by Bob
Berwick. Thx.
I suspect the DeepMind team chose what architectures to use based on what worked better in practice rather than based on philosophical arguments or neural plausibility; I'm not sure they have highly articulated views about connectionism or the merits of Fodor & Pylyshyn's arguments.
Another comment: I'm not sure I understand your definition of associationism. If the position is that "brains are entirely structured by the properties of the inputs to the system", as you wrote, that doesn't seem to apply to the DeepMind system, which has quite a bit of structure built into it (or to any contemporary neural network, really).
I just noticed that one of your footnotes makes a similar point to my first paragraph. Oops.
It seems to me that one kind of connectionist/Hebbian associative network, specifically the Hopfield network, is able to serve as a model of memory. In a previous lifetime, I even proved some results about the storage capacity of such networks, which does reside in the synaptic weights. Now these are artificial computing systems; actual memory systems may work differently.
I once tried to talk to Randy about this matter but we got quickly sidetracked by more interesting questions.
I'd agree that the Hopfield network can, in principle, serve as a model of memory; though it seems to me to only somewhat awkwardly mimic a content-addressable memory. But, as you said, this is a purely theoretical exercise, as it is quite evident that the properties of such networks (e.g., speed, energy consumption, etc.) are untenable neurophysiologically.
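For readers who haven't met them, here is a minimal sketch of the textbook Hopfield construction under discussion: patterns are stored in a symmetric weight matrix by a Hebbian outer-product rule, and retrieval is content-addressable in that iterating the sign update from a corrupted cue settles on a stored pattern. (This is the standard construction only, not the specific capacity results mentioned above.)

```python
import numpy as np

def store(patterns):
    """Hebbian storage: the weight matrix is a scaled sum of outer products
    of the stored ±1 patterns, with the diagonal zeroed. Here the memory
    really does reside in the synaptic weights."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, probe, steps=20):
    """Content-addressable retrieval: iterate the sign update from a
    (possibly corrupted) cue until the state settles on a stored pattern."""
    s = probe.copy()
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0
    return s

# Toy usage: store two random ±1 patterns, recall one from a corrupted cue.
rng = np.random.default_rng(0)
P = rng.choice([-1.0, 1.0], size=(2, 50))
W = store(P)
cue = P[0].copy()
cue[:5] *= -1   # flip a few bits
ok = np.array_equal(recall(W, cue), P[0])
```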
Patrick, could you expand on that comment? What speed do you mean .. learning speed or retrieval speed or ..? And what are the energy consumption issues?
Hi Alex, I’m sorry for replying so late, but the notification e-mail about your reply got caught up in spam and I only stumbled upon it right now by sheer accident.
Well, I was referring to the general slowness of such an implementation in neural tissue. Signal transmission is incredibly slow in nervous systems compared to conventional computers. Such network implementations require a vast number of neurons that spike frequently (recurrently); this then of course also ultimately concerns speed of retrieval (though that is not a clearly defined notion in this context). Regarding energy consumption: computing with neurons and their spikes requires energy, and the brain already consumes about 20% of the body’s total energy, about half of which is used for signalling. I cannot reproduce the numbers off the top of my head, but there’s work along these lines showing that the amount of energy consumed by such networks, if implemented using real neurons, would be way too high. I remember that Gallistel & King also have a chapter on this issue (and related ones) in their book, where I think they also point out that ventilation would quickly become a problem as well.