Sunday, November 20, 2016

Revisiting Gallistel's conjecture

I recently received two papers that explore Gallistel’s conjecture (see here for one discussion) concerning the locus of neuronal computation. The first (here) is a short paper that summarizes Randy’s arguments and suggests a novel view of synaptic plasticity. The second (here:[1]) accepts Randy’s primary criticism of neural nets and couples a neural net architecture with a pretty standard external memory system. Let me say a word about each.

The first paper is by Patrick Trettenbrein (PT) and it appears in Frontiers in Systems Neuroscience. It does three things.

First, it reviews the evidence against the idea that brains store information in their “connectivity profiles” (2). This is the classical assumption that inter-neural connection strengths are the locus of information storage. The neurophysiological mechanisms for this are long term potentiation (LTP) and long term depression (LTD). LTP/D are the technical terms for whatever strengthens or weakens interneuron connections/linkages. I’ve discussed Gallistel and Matzel’s (G&M) critique of the LTP/D mechanisms before (see here). PT reviews these again and emphasizes G&M’s point that there is an intimate connection between this Hebbian “fire together wire together” LTP/D based conception of memory and associationist psychology. As PT puts it: “Crucially, it is only against this background of association learning that LTP and LTD seem to provide a neurobiologically as well as psychologically plausible mechanism for learning and memory” (88). This is why, if you reject associationism and endorse “classical cognitive science” and its “information processing approach to the study of the mind/brain,” you will be inclined to find contemporary connectionist conceptions of the brain wanting (3).

Second, there is recent evidence that connection strength cannot be the whole story. PT reviews the main evidence. It revolves around retaining memory traces despite very significant alterations in connectivity profiles. So, for example, “memories appear to persist in cell bodies and can be restored after synapses have been eliminated” (3), which would be odd if memories lived in the synaptic connections. Similarly, it has recently been shown that “changes in synaptic strength are not directly related to storage of new information in memory” (3). Finally, and I like this one the best (PT describes it as “the most challenging to the idea that the synapse is the locus of memory in the brain”), PT quotes a 2015 paper by Bizzi and Ajemian which makes the following point:

If we believe that memories are made of patterns of synaptic connections sculpted by experience, and if we know, behaviorally, that motor memories last a lifetime, then how can we explain the fact that individual synaptic spines are constantly turning over and that aggregate synaptic strengths are constantly fluctuating?

Third, PT offers a reconceptualization of the role of these neural connections. Here’s an extended quote (5):

…it occurs to me that we should seriously consider the possibility that the observable changes in synaptic weights and connectivity might not so much constitute the very basis of learning as they are the result of learning.

This is to say that once we accept the conjecture of Gallistel and collaborators that the study of learning can and should be separated from the study of memory to a certain extent, we can reinterpret synaptic plasticity as the brain's way of ensuring a connectivity and activity pattern that is efficient and appropriate to environmental and internal requirements within physical and developmental constraints. Consequently, synaptic plasticity might be understood as a means of regulating behavior (i.e., activity and connectivity patterns) only after learning has already occurred. In other words, synaptic weights and connections are altered after relevant information has already been extracted from the environment and stored in memory.

This leaves a place for connectivity, though not as the mechanism of memory. Rather, connectivity is what allows memories to be efficiently exploited.[2] Memories live within the cell, but putting them to good use requires connections to other parts of the brain where other cells store other memories. That’s the basic idea. Or as PT puts it (6):

The role of synaptic plasticity thus changes from providing the fundamental memory mechanism to providing the brain’s way of ensuring that its wiring diagram enables it to operate efficiently…

As PT notes, the Gallistel conjecture and his tentative proposal are speculative, as theories of the relevant cell internal mechanisms don’t currently exist. That said, neurophysiological (and computational, see below) evidence against the classical Hebbian view is mounting, and the serious problems with storing memories in usable form in connection strengths (the basis of Gallistel’s critique) are becoming more and more widely recognized.

This brings us to the second paper noted above, which appears in Nature. It endorses the Gallistel critique of neural nets and recognizes that neural net architectures are poor ways of encoding memories. It adds a conventional RAM to a neural net and this combination allows the machine to “represent and manipulate complex data structures.”

Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning, but are limited in their ability to represent variables and data structures and to store data over long timescales, owing to the lack of an external memory. Here we introduce a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory matrix, analogous to the random-access memory in a conventional computer. Like a conventional computer, it can use its memory to represent and manipulate complex data structures, but, like a neural network, it can learn to do so from data.
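
To make the quoted description a bit more concrete, here is a minimal sketch of a network-style controller reading from and writing to an external memory matrix by content rather than by numeric address. It is not the paper's implementation: the function names, the toy memory and the `strength` parameter are all illustrative assumptions, and a real DNC learns its keys and gates and has further addressing machinery that is omitted here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def content_weights(memory, key, strength=10.0):
    """Soft address: a softmax over the similarity of each memory row to the key.

    `strength` (an assumed parameter) sharpens the focus on the best match.
    """
    scores = [strength * cosine(row, key) for row in memory]
    peak = max(scores)  # subtract the peak for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read(memory, weights):
    """Read vector: the weighted sum of memory rows."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]

def write(memory, weights, erase, add):
    """Return a new memory: each row is partly erased, then has `add` blended in."""
    return [
        [cell * (1.0 - w * e) + w * a for cell, e, a in zip(row, erase, add)]
        for w, row in zip(weights, memory)
    ]

# Toy memory with three slots; the key is a noisy version of slot 0.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
read_weights = content_weights(memory, key=[0.9, 0.1, 0.0])
recalled = read(memory, read_weights)

# Overwrite slot 0 completely with a new value.
updated = write(memory, [1.0, 0.0, 0.0], erase=[1.0, 1.0, 1.0], add=[0.0, 0.0, 5.0])
```

The point of the sketch is only that every step is smooth arithmetic, so a network could in principle be trained end to end to decide what to store and what to look up, while the stored values themselves sit in the matrix rather than in connection weights.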

Note that the system is still “associationist” in that learning is largely data driven (and as such will necessarily run into PoS problems when applied to any interesting cognitive domain like language), but it at least recognizes that neural nets are not good for storing information. This latter is Randy’s point. The paper is significant because it comes from Google’s DeepMind, and this means that Randy’s general observations are making intellectual inroads with important groups. Good.

However, this said, these models are not cognitively realistic for they still don’t make room for the domain specific knowledge that we know characterizes (and structures) different domains. The main problem remains the associationism that the Google model puts at the center of the system. As we know that associationism is wrong and that real brains characterize knowledge independently of the “input,” we can be sure that this hybrid model will need serious revision if intended as a good cog-neuro model.

Let me put this another way. Classical cog sci rests on the assumption that representations are central to understanding cognition. Fodor and Pylyshyn and Marcus long ago argued convincingly that connectionism did not successfully accommodate representations (and, recall, connectionists agreed that their theories dumped representations) and that this was a serious problem for connectionist/neural net architectures. Gallistel further argued that neural nets were poor models of the brain (i.e., and not only of the mind) because they embody a wrong conception of memory, one that makes it hard to read/write/retrieve complex information (data structures) in usable form. This, Gallistel noted, starkly contrasts with more classical architectures. The combined Fodor-Pylyshyn-Marcus-Gallistel critique, then, is that connectionist/neural net theories were a wrong turn because they effectively eschewed representations, and that this is a problem from both the cognitive and the neuro perspective. The Google Nature paper effectively concedes this point, recognizes that representations (i.e., “complex data structures”) are critical, and resolves the problem by adding a classical RAM to a connectionist front end.

However, there is a second feature of most connectionist approaches that is also wrong. Most such architectures are associationist. They embody the idea that brains are entirely structured by the properties of the inputs to the system. As PT puts it (2):

Associationism has come in different flavors since the days of Skinner, but they all share the fundamental aversion toward internally adding structure to contingencies in the world (Gallistel and Matzel 2013).

Yes! Connectionists are weirdly attracted to associationism as well as rejecting representations. This is probably not that surprising. Once one thinks of representations, it quickly becomes clear that many of their properties are not reducible to statistical properties of the inputs. Representations have formal properties above and beyond what one finds in the input, which, once you look, are found to be causally efficacious. However, strictly speaking, associationism and anti-representationalism are independent dimensions. What makes Behaviorists distinctive among Empiricists is their rejection of representations. What unifies all Empiricists is their endorsement of associationism. Seen from this perspective, Gallistel and Fodor and Pylyshyn and Marcus have been arguing that representations are critical. The Google paper agrees. This still leaves associationism, however, a position the Googlers embrace.[3]

So is this a step forward? Yes. It would be a big step forward if the information processing/representational model of the mind/brain became the accepted view of things, especially in the brain sciences. We could then concentrate (yet again) all of our fire on the pernicious Empiricism that so many cog-neuro types embrace.[4] But, little steps my friends, little steps. This is a victory of sorts. Better to be arguing against Locke and Hume than Skinner![5]

That’s it. Take a look.

[1] Thx to Chris Dyer for bringing the paper to my attention. I put the URL up rather than link to the paper directly as the linking did not seem to work. Sorry.
[2] Redolent of a competence/performance distinction, isn’t it? The physiological bases of memory should not be confused with the physical bases for the deployment of memory.
[3] I should add that it is not clear that the Googlers care much about the cog-neuro issues. Their concerns are largely technological, it seems to me. They live in a Big Data world, not one where PoS problems (are thought to) abound. IMO, even in a uuuuuuge data environment, PoS issues will arise, though finding them will take more cleverness. At any rate, my remarks apply to the Google model as if intended as a cog-neuro one.
[4] And remember, as Gallistel notes (and PT emphasizes) much of the connectionism one sees in the brain sciences rests on thinking that the physiology has a natural associationist interpretation psychologically. So, if we knock out one strut, the other may be easier to dislodge as well (I know that this is wishful thinking btw).
[5] As usual, my thinking on these issues was provoked by some comments by Bob Berwick. Thx.


  1. I suspect the DeepMind team chose what architectures to use based on what worked better in practice rather than based on philosophical arguments or neural plausibility; I'm not sure they have highly articulated views about connectionism or the merits of Fodor & Pylyshyn's arguments.

    Another comment: I'm not sure I understand your definition of associationism. If the position is that "brains are entirely structured by the properties of the inputs to the system", as you wrote, that doesn't seem to apply to the DeepMind system, which has quite a bit of structure built into it (or to any contemporary neural network, really).

    1. I just noticed that one of your footnotes makes a similar point to my first paragraph. Oops.

  2. It seems to me that one kind of connectionist/Hebbian associative network, specifically the Hopfield network, is able to serve as a model of memory. In a previous lifetime, I even proved some results about the storage capacity of such networks, which does reside in the synaptic weights. Now these are artificial computing systems; actual memory systems may work differently.

    I once tried to talk to Randy about this matter but we got quickly sidetracked by more interesting questions.

    1. I'd agree that the Hopfield network can, in principle, serve as a model of memory; though it seems to me to only somewhat awkwardly mimic a content-addressable memory. But, as you said, this is a purely theoretical exercise, as it is quite evident that the properties of such networks (e.g., speed, energy consumption, etc.) are untenable neurophysiologically.

    2. Patrick, could you expand on that comment? What speed do you mean .. learning speed or retrieval speed or ..? And what are the energy consumption issues?

  3. Hi Alex, I’m sorry for replying so late, but the notification e-mail about your reply got caught up in spam and I only stumbled upon it right now by sheer accident.

    Well, I was referring to the general slowness of such an implementation in neural tissue. Signal transmission is incredibly slow in nervous systems compared with conventional computers. Such network implementations require a vast number of neurons that spike frequently (recurrently); this of course ultimately also bears on speed of retrieval (though that is not a clearly defined notion in this context). Regarding energy consumption: clearly, computing with neurons and their spikes requires energy, and the brain already consumes about 20% of the body's total energy, about half of which is used for signalling. I cannot reproduce the numbers off the top of my head, but there’s work along these lines showing that the amount of energy consumed by such networks, if implemented using real neurons, is way too high. I remember that Gallistel & King also have a chapter on this issue (and related ones) in their book, where I think they also point to the fact that ventilation would quickly become a problem as well.
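
For readers who have not met the Hopfield network raised in the comments above, here is a minimal sketch of the idea that stored patterns can "reside in the synaptic weights" and be retrieved from a partial or corrupted cue. The function names and the two toy patterns are illustrative assumptions, not anything taken from the commenters' work, and the sketch makes no claim about neurophysiological plausibility, which is exactly what the thread disputes.

```python
def train(patterns):
    """Hebbian rule: strengthen the weight between units that agree in a stored pattern.

    Patterns are lists of +1/-1 values. No unit connects to itself.
    """
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights

def recall(weights, cue, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    state = list(cue)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state

# Two toy patterns, chosen to be orthogonal so they do not interfere.
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
hopfield_weights = train(stored)

# The first pattern with its first unit flipped serves as a corrupted cue.
cue = [-1, 1, 1, 1, -1, -1, -1, -1]
restored = recall(hopfield_weights, cue)
```

In this toy case the corrupted cue settles back onto the first stored pattern, which is the sense in which such a network behaves like a content-addressable memory. Note that the only thing the network can hand back is a whole stored pattern; whether that counts as memory in the read/write sense Gallistel cares about is the question the post is concerned with.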