Here are some readables.
First, those interested in some background reading on the DMZTP paper (Nai Ding, Lucia Melloni, Hang Zhang, Xiang Tian and David Poeppel) discussed here can look at Embick & Poeppel (E&P) (part deux) here. This 2015 paper is a sequel to a much earlier 2005 paper outlining the problems of relating work on the brain bases of linguistic behavior to linguistic research (discussed here). The paper situates the discussion in DMZTP within a larger research program in the cog-neuro of language.
E&P identifies three neuroling projects of varying degrees of depth and difficulty.
1. Correlational neuroling
2. Integrated neuroling
3. Explanatory neuroling
DMZTP fits neatly into the first project and makes tentative conjectures relevant to the second and third. What are these three projects? Here’s how E&P describes each (CR = computational/representational and NB = neurobiological) (360):
Correlational neurolinguistics: CR theories of language are used to investigate the NB foundations of language. Knowledge of how the brain computes is gained by capitalising on CR knowledge of language.

Integrated neurolinguistics: Correlational neurolinguistics plus the NB perspective provides crucial evidence that adjudicates among different CR theories. That is, brain data enrich our understanding of language at the CR level.

Explanatory neurolinguistics: (Correlational + Integrated neurolinguistics) plus something about NB structure/function explains why the CR theory of language involves particular computations and representations (and not others).
The whole paper is a great read (nothing surprising here) and does a good job of identifying the kinds of questions worth answering. Its greatest virtue, IMO, is that it treats results both in linguistics and in cog-neuro respectfully and asks how their respective insights can be integrated. This is not a program of mindless reduction, something that is unfortunately characteristic of too much current NB work on language.
Second, here’s a piece on some big methodological goings-on in physics. The question, relevant to our little part of the scientific universe, is what makes a theory scientific. It seems that many don’t like string theory or multiverses and think them and the theories that make use of them unscientific. Philosophers are called in to help clear the muddle (something that physicists hate even the idea of, but times are desperate it seems) and philosophers note that the muddle partly arises from mistaking the hallmarks of what makes something science. Popperian falsificationism is now understood by everyone to be a simplification at best and a severe distortion with very bad consequences at worst.
Chomsky once observed that big methodological questions about what makes something science are usefully focused on those areas of our greatest scientific success. I think that this is right. However, I also think that listening in on these discussions is instructive. Here’s a place where eavesdropping might be fun and instructive.
Third, here’s a Tech Review article on some recent work by Tenenbaum and colleagues on handwriting recognition (funny, just as cursive is on the cusp of disappearing, we show how to get machines to recognize it; the curious twists and turns of history). The research described is quite self-consciously opposed to deep learning approaches to similar problems. Where does the difference lie? Effectively, the program uses a generative procedure involving “strokes of an imaginary pen” to match the incoming letters. Bayes is then used to refine these generated objects. In other words, given a set of generative procedures for constructing letters, we can generate better and better matches to the input through an iterative process in a Bayes-like framework. And there is a real payoff: by putting this kind of generative procedure into the Bayes system, you can learn to recognize novel “letters” from a very small number of examples.
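To make the idea concrete, here is a minimal, purely illustrative sketch (not the actual model from Tenenbaum and colleagues; every name and number in it is invented for the example): characters are treated as little stroke “programs,” and a new example is classified by Bayesian scoring of how well each candidate program could have generated it.

```python
# Toy sketch: characters as stroke "programs", classified by Bayesian scoring.
# All programs, priors and noise values here are made up for illustration.
import numpy as np

GRID = 16  # render characters on a 16x16 grid

def render(strokes, jitter=0.0, rng=None):
    """Render a list of strokes ((x0, y0) -> (x1, y1), coords in [0, 1]) as a binary image."""
    rng = rng or np.random.default_rng(0)
    img = np.zeros((GRID, GRID))
    for (x0, y0), (x1, y1) in strokes:
        for t in np.linspace(0, 1, 50):
            x = x0 + t * (x1 - x0) + jitter * rng.normal()
            y = y0 + t * (y1 - y0) + jitter * rng.normal()
            i = int(np.clip(y * (GRID - 1), 0, GRID - 1))
            j = int(np.clip(x * (GRID - 1), 0, GRID - 1))
            img[i, j] = 1.0
    return img

def log_likelihood(observed, strokes):
    """Pixelwise Bernoulli log-likelihood of the observed image under a stroke program."""
    template = render(strokes)
    p = 0.9 * template + 0.05  # probability a pixel is "on" given the program
    return np.sum(observed * np.log(p) + (1 - observed) * np.log(1 - p))

# Two hypothetical "letter" programs: an L-shape and a T-shape.
programs = {
    "L": [((0.2, 0.1), (0.2, 0.9)), ((0.2, 0.9), (0.8, 0.9))],
    "T": [((0.1, 0.1), (0.9, 0.1)), ((0.5, 0.1), (0.5, 0.9))],
}
log_prior = {name: np.log(1 / len(programs)) for name in programs}  # flat prior over programs

# A single noisy example of an "L" stands in for the one training instance.
rng = np.random.default_rng(42)
observed = render(programs["L"], jitter=0.02, rng=rng)

# Posterior score = log prior + log likelihood; the best-scoring program wins.
scores = {name: log_prior[name] + log_likelihood(observed, strokes)
          for name, strokes in programs.items()}
print(max(scores, key=scores.get))  # -> "L"
```

The point of the toy example is just that the prior knowledge lives in the stroke programs; the “learning” is a small Bayesian comparison, not a trawl through millions of labeled images.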
Sound familiar? It should. Think Aspects! So, it looks like the tech world is coming to appreciate the power of “innate” knowledge, i.e. how given information can be used and extended. Good. This is just the kind of story GGers should delight in.
How’s this different from the deep learning/big data (DL/BD) stuff? Well, by packing in prior info you can “learn” from a small number of examples, which simplifies the inductive problem. Hinton, one of the mucky-mucks in the DL/BD world, notes that this stuff is “compatible with deep learning.” Yup. Nonetheless, it fits ill with the general ethos behind the enterprise. Why? Because it exploits an entirely different intuition concerning how to approach “learning.” From the few discussions I have seen, DL/BD starts from the idea that if you get enough data, learning will take care of itself. Why? Because learning consists in extracting the generalizations in the data. If the relevant generalizations are there in the data to be gleaned (even if lots of data is needed to glean them, as there is often A LOT of noise obscuring the signal), then given enough data (hence the ‘big’ in Big Data) learning will occur. The method described here questions the utility of this premise. As Tenenbaum notes:
“The key thing about probabilistic programming—and rather different from the way most of the deep-learning stuff is working—is that it starts with a program that describes the causal processes in the world,” says Tenenbaum. “What we’re trying to learn is not a signature of features, or a pattern of features. We’re trying to learn a program that generates those characters.”
Does this make the two approaches irreconcilable? Yes and no. No, in that one can always combine two points of view that are not logically contradictory, and these views are not. But yes, in that what one takes to be central to solving a learning problem (e.g. a pre-packaged program describing the causal process) is absent from the other. It’s once again the question of where one thinks the hard problem lies: describing the hypothesis space or the movements around it. DL/BD downplays the former and bets on the latter. In this case, Tenenbaum does a Chomsky and bets on the former.
As many of you know, I have had my reservations concerning many of Tenenbaum’s projects, but I think he is making the right moves in this case. It always pays to recognize common ground. DL/BD is the current home of unreconstructed empiricism. In this particular case, Tenenbaum is making points that challenge this worldview. Good for him, and me.