Thursday, March 27, 2014

POS and the inverse problem in vision

About a month ago, Bill Idsardi gave me an interesting book by Dale Purves to read (here). Purves is a big deal neuroscientist at Duke who works on vision. The book is a charming combination of personal and scientific biography: how Purves got into the field, how it has changed since he entered it and how his personal understanding of the central problem in visual perception has changed over his career. For someone like me, interested in language from a cog-neuro perspective, it’s fun to read about what’s going on in a nearby, related discipline. The last chapter is especially useful, for in it Purves presents a kind of overview of his general conclusions concerning what vision can tell us about brains. Three things caught my eye (get it?).

First, he identifies “the inverse problem” as the main cog-neuro problem within vision (in fact, in perception more generally). The problem is effectively a POS problem: the stimulus info available on the retina is insufficient for figuring out the properties of the distal stimulus that caused it. Why? Because there are too many ways that the pattern of stimulation on the eyeball could have been caused by environmental factors. This reminds me of Lila’s old quip about word learning: a picture is worth a thousand words and this is precisely the problem. So, the central problem is the inverse problem and the only way of “solving” it is by finding the biological constraints that allow for a “solution.”[1] Thus, because the information available at the eyeball is too poor to deliver its cause, yet we generalize in some ways but not others, there must be some constraints on how we do this, and these need recovering. As Purves notes, illusions are good ways of studying the nature of these constraints, for they hint at the sorts of constraints the brain imposes to solve the problem. For Purves, the job of the cog-neuro of vision is to find these constraints by considering various ways of bridging this gap.
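To make the many-to-one character of the inverse problem concrete, here is a minimal sketch in Python. It is my illustration, not an example from Purves's book: it assumes a toy eye where all the retina records is the visual angle an object subtends, and the function names and numbers are made up. Very different size/distance pairs produce the same angle, so the angle alone cannot tell you which scene caused it; adding a constraint (here, an assumed object size) picks out one answer.

```python
import math

def visual_angle_deg(size_m, distance_m):
    """Forward problem: angle subtended at the eye by an object of a given size and distance."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))

def distance_given_size(angle_deg, assumed_size_m):
    """Inverse problem made solvable by a constraint: an assumed (hypothetical) object size."""
    return assumed_size_m / (2 * math.tan(math.radians(angle_deg) / 2))

# Three very different distal scenes: (size in meters, distance in meters).
candidates = [(0.1, 1.0), (1.0, 10.0), (10.0, 100.0)]
angles = [visual_angle_deg(size, dist) for size, dist in candidates]

# All three scenes project the same angle, so the proximal stimulus
# underdetermines the distal one.
for (size, dist), angle in zip(candidates, angles):
    print(f"size={size} m, distance={dist} m -> angle={angle:.3f} deg")

# With the assumption that the object is 1 m across, only one distance fits.
print(distance_given_size(angles[0], assumed_size_m=1.0))
```

The point of the sketch is only that the forward mapping is easy and the inverse mapping is undefined without something extra. Where that extra something comes from is exactly what is at issue.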

This way of framing the problem leads to his second important point: Purves thinks that because the vision literature has largely ignored the inverse problem it has misconceived what kinds of brain mechanisms we should be looking for. The history as he retells it is interesting. He traces the misconception, in part, to two very important neuroscience discoveries: Hubel and Wiesel’s discovery of “feature detecting” neurons and Mountcastle’s discovery of the columnar structure of brains. These two ideas combined to give the following picture: perception is effectively feature detection. It starts with detecting feature patterns on the retina and then ever higher order feature patterns of the previously detected patterns. So it starts with patterns in the retina (presumably products of the distal stimulus) and does successive higher order pattern recognition on these. Here’s Purves (222-3):

…the implicit message of Hubel and Wiesel’s effort [was] to understand vision in terms of an anatomical and functional hierarchy in which simple cells feed onto complex cells, complex cells feed onto hypercomplex cells, and so on up to the higher reaches of the extrastriate cortex….Nearly everyone believed that the activity of neurons with specific receptive field properties would, at some level of the visual system, represent the combined image features of a stimulus, thereby accounting for what we see.

This approach, Purves notes, “has not been substantiated” (223). 

This should come as no surprise to linguists. The failed approach that Purves describes sounds to a linguist very much like the classical structuralist discovery procedures that Chomsky and others argued to be inadequate over 50 years ago within linguistics. Here too the idea was that linguistic structure was the sum total of successive generalizations over patterns of previous generalizations. I described this (here) as the idea that there are detectable patterns in the data that inductions over inductions over inductions would reveal. The alternative idea is that one needs to find a procedure that generates the data and that there is no way to induce this procedure from the simple examination of the inputs. This is, in effect, the inverse problem. If Purves is right, this suggests that within cog-neuro the inverse problem is the norm and that generalizing over generalizations will not get you where you want to go. This is the same conclusion as Chomsky’s 50 years earlier. And it seems to be worth repeating given the current interest in “deep learning” methods, which, so far as I can tell (which may not be very far, I concede), seem attracted to a similar structuralist view.[2] If Purves (and Chomsky) are right (and I know that at least one of them is, guess which) then this will lead cog-neuro down the wrong path.

Third, Purves documents how studying the intricacies of cognition using behavioral methods was critical in challenging the implicit, very simple theory common in the neuro literature. Purves notes how understanding the psycho literature was critical in zeroing in on the right cog-neuro problem to solve. Moreover, he notes how hostile the neuro types were to this conclusion (including the smart ones like Crick). It is not surprising that the prestige science does not like being told what to look at by the lowly behavioral domains. So, in place of any sensible cognitive theory, neuro types invented the obvious ones that they believed to be reflected in the neuro structure. But, as Purves shows (and any sane person should conclude), neuro structure, at least at present, tells us very little about what the brain is doing. This is not quite accurate, but it is accurate enough. In the absence of explicit theory, implicit “empiricism” always emerges as the default theory. Oh well.

There is lots more in the book, much of it, btw, that I find either oddly put or wrong. Purves, for example, has an odd critique of Marr, IMO. He also has a strange idea of what a computational theory would look like and places too much faith in evolution as the sole shaper of the right solutions to the inverse problem. But big deal. The book raises interesting issues relevant to anyone interested in cog-neuro regardless of the specific domain of interest. It’s a fun, informative and enjoyable read.


[1] I use quotes here for Purves argues that we never make contact with the real world. I am not a fan of this way of putting the issue, but it’s his. It seems to me that the inverse problem can be stated without making this assumption: the constraints being one way of reconstructing the nature of the distal stimulus given the paucity of data on the retina.
[2] As the Wikipedia entry puts it: “Deep learning algorithms in particular exploit this idea of hierarchical explanatory factors. Different concepts are learned from other concepts, with the more abstract, higher level concepts being learned from the lower level ones. These architectures are often constructed with a greedy layer-by-layer method that models this idea. Deep learning helps to disentangle these abstractions and pick out which features are useful for learning.”

1 comment:

  1. So, as a likely way-out-in-left-field remark, it's possible to interpret the nice mathematical properties of MGs as opposed to LFGs as at least partially a consequence of the fact that the latter implement sharing/movement as unification of substructures in an overt->covert structure derivation (where there aren't any useful constraints on the complexity of structure that gets unified), while the former do it by generating the covert structures with a context-free process and then implementing sharing/copying by propagation of pointers to the shared structures. In other words, having a constrained theory of the covert forms is useful.

    I think it is possible to adapt my 'Semantic Lexicon' idea for LFG (in my LFG08 paper) to serve as a generative component for f-structures that is regular (unless I've made some really dumb error, which is possible), which is hopefully a step towards reducing the gap between the frameworks.