Friday, May 11, 2018

Ideas that break the mold

I am currently re-reading a terrific book on the history of modern molecular biology called The Eighth Day of Creation (here, henceforth 8-day). The book reviews some of the seminal scientific events in modern biology, starting with Watson and Crick’s discovery of the double helix structure of DNA. The book is really fun to read given that it intersperses serious science with lots of titillating gossip about the relevant personalities.

The fun aside, the book (confession: I’ve read the first 200 pages so far and this deals exclusively with DNA) raises two interesting questions for someone like me. 

First, it seems to point to two kinds of “revolutions” in the sciences. The first kind is one that everyone is waiting to happen: had the work that fomented it not been done, analogous work would soon have been produced, making an analogous intellectual contribution. The second kind of work is the opposite: had the people who did it not been around, then nobody else would have done it (or at least not soon). Rather, the idea’s birth would have been long (maybe perpetually) delayed. Both kinds of work are groundbreaking and deserving of the kudos and prizes heaped upon them. The difference is that discoverers of the first kind are distinguished by breaking the tape a bit ahead of others, while those of the second are distinguished by having been the only ones running the race at all.

The second question, of course, is whether any work currently being pursued in my extended neck of the woods smells like either one of these. What is the next big idea? Needless to say, the first kind will be easier to sniff out than the second, given that the second seems to come out of nowhere. But I suspect that nothing really comes completely out of nowhere, and I will suggest that one idea we have been tracking in FoL, one treated as scientifically dubious until now, is gaining traction and beginning to look like an idea whose time has come. In other words, if we take the progression from Ridiculous! to Obvious! via Sorta/Maybe! as an early indicator of an intellectual revolution, then I think the Gallistel-King conjecture (GKC) is about to enjoy some quality time in the intellectual sun.

It should go without saying (but I will say it nonetheless) that everything I say in what follows is entirely half-assed and speculative. This partially comes with the subject matter. But as I like these sorts of issues, and cannot resist, and have nothing better to talk about at the moment, I will indulge myself. You need not follow.

Let’s start with the two kinds of revolutions. 8-day makes the case (not deliberately, I should add) that the helix was waiting to be discovered, and though Watson and Crick got there first, someone else would have grokked the structure very soon if they had not. Likely candidates include Wilkins, Franklin, and almost certainly Pauling. There were probably others around who could have figured out the basic ideas as well (or so 8-day leads me to believe). In fact, Crick seems to agree with this assessment (see 8-day:155). I do not intend this observation to denigrate the achievement (more exactly: who the hell am I to be able to denigrate it?), just to note that it seems to be an idea whose time had arrived. Many researchers thought that DNA was the important big molecule to chemically understand. They thought this because they knew that it was the repository of hereditary information. Many thought that it was some sort of helix, and many thought that X-ray pictures were the right kind of evidence to probe its structure. There were several mathematical accounts available (albeit imperfect) to argue from pictures to structure, and it seems from the story 8-day tells that sooner or later the structure would have been cracked (maybe in dribs and drabs, as Crick notes that Medawar suggested in the quoted note on 8-day p. 155).

One could say something similar for other great discoveries. Einstein’s theory of special relativity was very similar to other theories that cropped up at the time (Lorentz, Poincaré); Darwin’s theory of natural selection was simultaneously discovered by Wallace. The same appears to be true of the work on QED in more modern times. Again, all of this stuff is great, but it was stuff that seems to have been “in the air” and was something that someone would have discovered pretty soon after whoever is credited with the work did it.[1]

This contrasts with other kinds of discoveries. I am told that Einstein’s General Theory is something that really arrived unexpectedly and that nobody was working along the same lines. Ditto with Mendelian genetics (which was so far ahead of its time that it lay unappreciated for some 35 years until it was rediscovered by others). McClintock’s theory of jumping genes might fit in here too, from what I know of it, as would Marshall and Warren’s theory of the bacterial origins of ulcers (which the rest of the scientific community scoffed at until it received the Nobel).

To this list, I would add Chomsky’s discovery that humans have an FL built to acquire and use recursive Gs with distinctive computational properties. This is an idea which, though obviously correct, is still resisted in many quarters. From my read of the history, it seems clear that had Chomsky not made the case for Generative Grammar nobody would have made it for many years to come (if ever, if current resistance is any indication).

There is one more idea that is coming into its own that I would add to the list, and it brings us to the second question (i.e. is anything like this on the horizon now?): Gallistel’s conjecture that human cognitive computation is intra-neuronal and a species of chemical computation, rather than inter-neuronal and “connectionist.” This idea has been roundly resisted (and dismissed) by most of the cog-neuro community. The idea that brain computations are not “like” classical computing at all (no registers, variables, write-to and read-from memory etc.) is a virtual dogma in the neurosciences (and has been for well over 30 years). Neo-connectionism is the name of the cog-neuro game, and Gallistel’s critiques have been largely ignored and his more positive proposals barely attended to. Until recently.

I have noted several recentish studies that have argued that there are (at least) some intra-neuronal calculations that cells do (type “Gallistel-King conjecture” into the find box on the top left corner for posts on the topic). Another one has just appeared in Science (here). The authors are Tagkopoulos, Liu, and Tavazoie (TLT). They show how E. coli are capable of “forming internal representations that allow prediction of environmental change.” They do this using “intracellular networks” of “biochemical reactions.” Using these networks, these single-cell microbes “form internal representations of their dynamic environments that enable predictive behavior.” Further, consistent with the Gallistel-King conjecture (GKC), it appears that these biochemical representations consist of “genome wide transcriptional responses” based on the DNA-RNA-protein system characteristic of modern cellular biochemistry.

I am no expert in these matters, but it sure looks like what TLT are finding comports quite nicely with the most straightforward version of the GKC, in which cognitive computation is based in the same kinds of processes and networks used to convey hereditary information. First, both take place within single cells. Second, the information processing has a pretty classical look and embodies a computational architecture (as discussed in detail in the Gallistel & King book) exploiting DNA/RNA/proteins in the way the GKC initially proposed. Not bad for armchair theorizing. Not bad at all.

As I mentioned, there is more and more stuff coming out that provides empirical support for this big idea. And as I have also mentioned elsewhere, this is roughly what we should expect. The GKC is the conservative hypothesis concerning cognitive computation, despite its also being iconoclastic. It claims that cognition supervenes on an information processing network that we know cells have and that is used for another purpose (passing traits on to future generations). This system is computationally very rich (it embodies a classical (Turing/von Neumann) computational architecture), as Gallistel and King show. The GKC makes the intellectually conservative proposal that an in-place information processing network (an extant system that passes genetic information across generational time) is also used (or repurposed) for other kinds of info processing tasks (i.e. cognitive information processing). This is standard Darwinian thinking.

In contrast, connectionism is quite radical, as it proposes a novel computational apparatus to do the heavy cognitive lifting, bypassing a perfectly respectable extant, in-place, up-and-running system. Of course, this might be what happened, but it is still a very radical proposal and should only be accepted if there is very significant evidence in its favor. And as Gallistel has argued, there is really no good evidence to support it and lots of problems with it. I will not rehearse these here (but see here), except to say that it is quite amazing how a bad idea gains staying power if it leverages another really bad idea. The marriage of connectionism and associationism is one such stable couple, as Gallistel has shown, and the fact that neither is convincing on its own seems not to have convinced the neuro-cognoscenti to dump the pair.

It is fun to speculate just how game changing a world that accepts GKC would be. Cog-neuro could really start stealing liberally from our biological friends. Learning would be to cognition what development is to biology (the building of forms based on genetic information plus environmental inputs). All that inter-neuronal chatter might be re-analyzed as sharing computational results rather than executing actual computations. One could imagine a kind of neuronal wisdom-of-crowds system where individual neurons compute and then “vote,” with the popular favorite output carrying the day. But all of this is really fanciful speculation, completely unmoored from any knowledge (my specialty!). The important point is that it’s looking more and more like GKC is onto something, and if it turns out to be even roughly correct, the consequences for what we do in the cog-neuro sciences will be profound. Why do I think this? Because it’s what happened in biology. Indeed, it’s the big moral from 8-day. Let me explain.

As I said, 8-day is a terrific read and spurs endless fun speculation. It also carries a moral for linguists (and psychologists) with a cognitive bent. To wit: the intellectual challenge facing people in the mid 50s as regards finding the structure of DNA is quite analogous to the central problems in cog-neuro today. The problem then was to find a way of physically grounding the gene. The problem was usefully bounded by the fact that it had to be a structure that comported with the insights of Mendelian genetics (in particular, the fact that each parent passes half of its genetic material on to its offspring). The intellectual challenge was to find a physical structure that would make clear how this was possible. The Watson-Crick structure for DNA did this in a beautiful way. It showed how Mendel’s genetics could be incarnated. Thus, Mendel’s insights formed a boundary condition on the structure of whatever it was that served to transmit hereditary information. The helical structure of DNA did this almost perfectly, and Watson and Crick noted as much in their original paper. Here’s the money quote (8-day:154):

It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material. 

We find ourselves in a similar situation today. We know a great deal about parts of cognition. We know a lot about some of the computational properties of cognition. We know that this requires representations with complex properties that demand something very like a classical computational architecture. We know that something like innate cognitive knowledge exists that allows for the kinds of cognitive computations biological systems perform. The goal of neuroscience should be to figure out how this is incarnated in biological material: e.g. What’s an address? How do you read from and write to the system? What’s a variable? What’s a pointer to an address? How do you store a number in memory? How is “innate” knowledge genetically coded? These are all things we understand how to execute in silicon. The cognitive theories we have tell us that our embodied computational system must have these kinds of structures and operations as well. The neuro question is how this is embodied in biological material (as opposed to silicon). The GKC builds on the fact that, though we know how this could be done using intra-neuronal chemistry, we have no idea how it could be done using inter-neuronal connections. The obvious conclusion is that cognitive computation is physically grounded in intra-neuronal chemistry. Amazingly, we are starting to get some details about how this might be done. And nobody would have thunk it when the GKC was first mooted.
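To make the list of primitives above concrete, here is a toy sketch (my own illustration, not anything from Gallistel & King) of what a classical read/write memory amounts to functionally: addresses, write-to and read-from operations, variables, and pointers. The claim in the paragraph is that any substrate implementing cognition, silicon or chemistry, must realize something with this profile.

```python
# Toy sketch of classical memory primitives (illustrative only):
# an addressable store, write-to/read-from operations, and a pointer,
# i.e. a stored value that is itself an address.

class ToyMemory:
    """A minimal addressable read/write store."""

    def __init__(self, size):
        self.cells = [0] * size   # each index is an address

    def write(self, address, value):
        self.cells[address] = value   # write-to

    def read(self, address):
        return self.cells[address]    # read-from


mem = ToyMemory(8)
mem.write(3, 42)          # store a number at address 3
pointer = 3               # a pointer is just an address held as a value
value = mem.read(pointer) # dereference the pointer
print(value)              # -> 42
```

The point of the sketch is only that these operations are trivial to state and to execute in silicon; the GKC's bet is that cell chemistry already implements their functional equivalents.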

[1]That said, the various “versions” had different virtues. So Einstein’s theory of special relativity was substantially different from Lorentz’s and the ways it was different mattered. So too the various versions of QED. Feynman’s formulation spread quickly to the community because of its intuitive appeal. Schwinger’s (I am told) was no less adequate but it was far more technically challenging and harder to conceptualize. These are not small differences, but the general point stands: the basic analyses were very similar and had Einstein not come up with his theory or Feynman with his someone would have come up with a working version that the community would have embraced.


  1. Interesting -- isn't there a correlation between the two kinds of breakthroughs and how easy it is to see whether they're correct? The first-through-the-tape kind of breakthrough should also be one that would be more or less instantly recognized as correct. The out-of-left-field kind of breakthrough should be one that isn't obviously correct, because if it were, then it would have been solved anyway as soon as somebody smart enough thought about it for long enough. Right? So does your position here that Chomsky's UG theory is a breakthrough of the more original kind contradict your April 16th "moral certainty" argument that it couldn't really be any other way?

  2. Thanks for the pointer to TLT. But (just in case you'd want to know) it seems to be from 2008, so not something that "just appeared."

  3. On Einstein and SR: I'm no expert (which has never stopped me teaching it), but, on my understanding, all the pieces of SR were in place by the time Einstein came around in 1905 (the Lorentz transformation, the speed of light, the Michelson-Morley results, the Minkowski geometry, etc.). What was great about Einstein, apart from his elegant way with the transformation, slightly different from Lorentz and Poincaré, was to put the pieces together in a way that is metaphysically whacko, but also exactly right! It is as if Einstein took the side of electromagnetism against Newtonian mechanics, and swallowed the consequences, whereas the 'standard problem' was to show how everything reduced to the mechanics. So, I don't think it was so much a matter of lots of people edging towards the right result, but lots of people going off in the wrong direction because of certain metaphysical preconceptions. Einstein famously credited Mach in this regard, although there is a kinda idealism lurking in the background, which Einstein later acknowledged. I think there are lots of nice analogies hereabouts with GG. For example, I've never got the weight of the accusation that Chomsky 'stole' the idea of transformations from Harris. Even were that true, which it isn't, it would be like saying Einstein stole the transformation (sorry for the pun) from Lorentz. 'Oh, no! Einstein the idiot thief!'

    1. Yes, or Pullum's claim that Chomsky's invention of generative grammar is not original, because Syntactic Structures is all stolen from Emil Post (Pullum 2011, On the mathematical foundations of Syntactic Structures, J. Log Lang Inf).

  4. This new study in eNeuro looks like grist for your Gallistelian mill: RNA from Trained Aplysia Can Induce an Epigenetic Engram for Long-Term Sensitization in Untrained Aplysia -- they trained a slug, and then injected RNA from it into another slug, which behaved as though trained.

  5. Peter: I think GP makes some nice points in that paper, but, right, it ignores the explanatory matters to which Chomsky puts the technical machinery, harps on the limit proof for PSGs (even though Chomsky WAS right!), and the issue about Post is a bit odd - he is cited but not enough!

    1. Yes, besides the misdirected fury at the absence of a formal proof, I didn't feel it made the central case that it tried to make, that GG "springs directly out of the work of ... Post," reducing Chomsky’s role to midwifery. But I have to admit I have a soft spot for the paper because it cites my dad, so I’m happy to have reasons to like it, if you want to tell me what points impressed you.

    2. Your dad was Lars Svenonius! I looked at some of his stuff during my MA. I'll have to re-read the paper to remember what I like about it. The tone got in the way, but I think I liked the gathering of the material in one place, as it were, and while the 'fury' was misdirected, he is right on the detail. Still, the big wonderful picture is missed.

    3. What I think Chomsky did was re-establish 'succinctness' as scientifically meaningful; the pre-Bloomfieldians seemed to have just assumed that it was (cf. the Tübatulabal morphophonemics sketch in the Joos volume); it was lost for a while under the influence of Bloomfield (whose own actual position is pretty hard to read, I think I remember from trying to understand it several decades ago); then C reinstated it as fundamental to explaining why language is learnable, with the advantage of much better formal tools created by Post, etc.

      Note in particular that putting succinctness front and center defuses many issues concerning the arguments that language is not finite state; we must not only come up with some scheme that can produce observed corpora, but also explain the fact that the structure of center-embedded things is pretty much the same as that of non-center-embedded things, which is very easy to do with a PSG but not with an FSG. And likewise for many other phenomena, including most people's favorite ones.
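      The PSG/FSG contrast in the comment above can be sketched with the textbook example a^n b^n (my illustration, not the commenter's). A single recursive phrase-structure rule, S -> a S b | empty, generates all and only the matched strings; a finite-state grammar cannot enforce the matching for unbounded n, since it would need a distinct state for every depth of embedding.

```python
# Toy illustration of center-embedding as { a^n b^n }.
# A two-line PSG (S -> a S b | empty) generates it succinctly;
# no finite-state grammar can check the matching for unbounded n.

def generate(depth):
    """Apply S -> a S b 'depth' times, then S -> empty."""
    return "a" * depth + "b" * depth

def is_center_embedded(s):
    """Membership test for { a^n b^n }: the nesting an FSG can't count."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

print(generate(3))                 # -> aaabbb
print(is_center_embedded("aabb"))  # -> True
print(is_center_embedded("aab"))   # -> False
```

      The succinctness point is that the PSG description is two rules, whereas any finite-state approximation grows with the embedding depth it covers, which is the explanatory asymmetry the comment is after.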