Friday, June 8, 2018

Science without theory

Sometimes we just don't know much and this puts us in an odd epistemic position. Not knowing much comes with an imperative of intellectual modesty: one should have relatively little confidence in one's descriptions of what is going on and even less in projections of what might happen counterfactually. Being ignorant is a real bummer, especially scientifically.

Now, all of this should be obvious. And to many it is. But it is also something that working scientists have a professional interest in "bracketing" (a term that roughly means "setting aside" that I learned as a philo grad student and that has come in very handy over the years as it sounds so much better than "ignore" (something an honest intellectual should not do (purportedly)), which is more or less what it amounts to), and so, not surprisingly, they largely do. Moreover, as nobody gets kudos for advertising their ignorance (well, there was Socrates, I guess, and he got less kudos and occasionally a hemlock milkshake), scientists, especially given the current incentive system, are less forthcoming about the flimsiness of their conclusions than perhaps they should be. And this is a problem, for it is really hard to say anything useful when you have no idea what the hell is going on.

Why do I mention this? Rob Chametzky sent me a recent paper on this topic (here). The post (the author is Denny Borsboom (here), so I will dub the post DB) makes the reasonable point (reasonable to me as I have been making it as well for a while now) that the absence of "unambiguously formalized theory in psychology" lies behind much of the "replication" crisis in psych (p.1).[1] It, moreover, suggests that this is more or less endemic to the discipline because psychology has "the hardest subject matter ever studied" (p. 3). I do not know whether this last point is correct, but it is certainly true that much of what psychologists insist on studying is almost surely the product of many many interacting systems (e.g. almost any social psych topic!) and it is well-known that interaction effects are very hard to disentangle, very very hard. So, it is not surprising that if these topics are the sorts of things that psychologists study then the level of non-trivial theory that exists is close to zero. That is what one would expect (and what one finds). DB traces out some of the implications of this for the replication crisis. Here are some consequences:

·      The field is heavily stats dependent, as stats methods substitute for theoretical infrastructure.
·      The role of stats can grow so great as to induce theoretical amnesia in the practitioners (a mental state wherein those in the field no longer know what a theory is) (p.2).
·      Progress in atheoretical psych is necessarily very slow given that experiments are always trying to factor out poorly understood context dependent variables.
·      The discipline is susceptible to fads based on poorly tested generalizations that serve to make research manageable (at best) and allow for a kind of predatory free-riding (at worst).

Needless to say, this is not a pretty state of affairs. The solution for this? Well, of course, more care with the stats and a kind of re-education system for psychologists: "It would be extremely healthy if psychologists received more education in fields which do have some theories, even if they are empirically shaky ones, so that the discipline can try to remember what a theory is and what it is good for, so that we don't fall into theoretical amnesia" (p.3). 

I cannot say that I find DB's description all that far off base. However, I think that a few caveats are in order. Here are some.

First, why is theoretical amnesia a bad thing if the field is doomed to be forever theoryless given its endemic difficulty? Understanding how theory functions in a field where it functions usefully is worthwhile only if that understanding can be imported into one's own field; in that case, being able to recognize theory and value it is important. But if this is impossible, then why bother?  

I suspect that DB's real gripe is that there is theory to be had (maybe by reshaping the topics studied) but that psychologists have been trained to ignore it and to substitute stats methods for theoretical insight. If this is DB's point, then it is an important one. And it applies to many domains where empirical methods often overrun their useful boundaries. If so, there is a better way to put it: stats, no matter how technically fancy, cannot substitute for theory. Or, to put this in lay terms: lots of data carefully organized is not a theory, and thinking that it is amounts to a confusion.

Second, the general point that DB makes (and I agree with) is not at all idiosyncratic. Gelman has made the point here (again) recently. The last paragraph is a good summation of DB's basic point:

…hypotheses in psychology, especially social psychology, are often vague, and data are noisy. Indeed, there often seems to be a tradition of casual measurement, the idea perhaps being that it doesn’t matter exactly what you measure because if you get statistical significance, you’ve discovered something. This is different from econ where there seems there’s more of a tradition of large datasets, careful measurements, and theory-based hypotheses. Anyway, psychology studies often (not always, but often) feature weak theory + weak measurement, which is a recipe for unreplicable findings.
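To make the "weak theory + weak measurement" point concrete, here is a toy simulation (mine, not Gelman's or DB's; the sample sizes and the number of outcome measures are invented for illustration). It assumes a world with no true effect at all and a "researcher" who, lacking a theory that fixes the measurement in advance, tries several noisy outcome measures and keeps the best p-value:

import numpy as np
from scipy import stats

# Toy simulation: no true effect anywhere, noisy measures, and a flexible
# analysis in which the "researcher" tries several outcomes and keeps the
# best p-value. All numbers are made up for illustration.
rng = np.random.default_rng(0)

def one_study(n=30, n_measures=10):
    # Two groups of n subjects; each of n_measures outcomes is pure noise.
    best_p = 1.0
    for _ in range(n_measures):
        a = rng.normal(size=n)
        b = rng.normal(size=n)
        best_p = min(best_p, stats.ttest_ind(a, b).pvalue)
    return best_p

n_studies = 2000
hits = sum(one_study() < 0.05 for _ in range(n_studies))
# With 10 independent noise-only measures, expect roughly 1 - 0.95**10, i.e. ~40%.
print(f"'Significant' findings despite no true effect: {hits / n_studies:.0%}")

Roughly forty percent of these studies "find" something, and since there is nothing there to find, attempts to replicate any particular finding will succeed only at chance levels. That is the Gelman recipe in miniature.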

Having little theory is a real problem even if one's aim is to get a decent stats description of the lay of the land. The problems one finds in theoryless fields are what one should expect, and the methodological sloppiness comes with the territory. As Gelman puts it:

p-hacking is not the cause of the problem; p-hacking is a symptom. Researchers don’t want to p-hack; they’d prefer to confirm their original hypotheses. They p-hack only because they have to.

If this is right, however, I think that both Gelman's and DB's optimism that this can be solved by better methodological hygiene is probably unfounded. The problem is that there is a real cost to not knowing what the hell is going on, and that cost is not merely theoretical but observational as well. Why? 

Here is a good place to play the Einstein card (roll the drums): theory is implicated in determining what counts as observational! Here's a quote: "It is the theory which decides what we can observe."[2] To know what to count and how to count it (that's what stats does) you need a way of determining what should be counted and how (that's what theory does). So, no theory, no observations in the relevant scientific sense either. And if this is so, then when you really have no idea what the hell is going on, you are in deep doo-doo whether you know it or not. Can stats help? I doubt it. Being very careful and very cautious might help. But really the only thing to do in these cases is pray to the God of scientific traction for a bit of luck in getting you started. There is a reason why researchers who figure out anything are heroes. They are the people whose ideas allow us to get the ball rolling. Once it's rolling it's a whole new game. Until then, nada! So, I doubt that the techno-optimism that Gelman and DB point to, the idea that sans theory stats can step in and allow us to do another kind of useful science, will really fly. But then, I am a pessimist in general.

Third, I don't think that what holds for social psych is characteristic of the whole endeavor. There are large parts of psych broadly understood (e.g. large parts of learning theory (see Gallistel's work on this), development (see Carey and Spelke and Baillargeon and R. Gelman a.o. for example), perception, math capacities, language) where there is quite a bit of decent non-trivial theory that usefully guides inquiry. The problem is that social psych is where the fame and money are. You get on NPR for work on power poses, but not on Weber's law. 

Fourth, this is really not the state of play in most of linguistics. We really do have some decent theory to fall back on in many parts of the core discipline (syntax, phonology, parts of semantics), and that is why we have been able to make non-trivial progress. The funny thing is that if DB and Gelman are correct, the brouhaha over linguistic data foisted upon the field by the stats inclined has things exactly backwards, for if they are right, the methods adopted by fields without any theoretical ideas are bad models for those that have some theoretical sub-structure.

Fifth, what DB and Gelman describe is really what we should expect. There is a long-standing hope that there exists a mechanical way of doing science (i.e. gaining insight). If we were just careful enough in how we gathered data, if we only got rid of our pre-conceptions, if only our morals were higher, we could just look and see the truth. On this view, when the simple method fails, it is because we didn't do it right. This is, of course, a reflection of the old Empiricist dream (see the previous post for the logical Positivist version of this). It repeatedly fails. It will always fail. 

That's it. Thx to Rob for sending me the DB piece. 


[1] The adjective "formalized" actually understates the problem. There is precious little non-trivial theory in large parts of psych (especially social psych, the epicenter of the crisis), formalized or not. The problem is not "formalization" per se.  
[2] Quoted in What Is Real? by Adam Becker, p. 29. The philosopher Grover Maxwell made a similar point oh so many years ago: "It is theory…which tells us what is or is not…observable" (Becker 184). If this is right, then the idea of grounding science on some a priori conception of the observable independent of any theoretical assumptions is pretty much a non-starter. We indulge in "theory" either explicitly or tacitly. The optimism arises, I believe, by ignoring this or by hoping that for much of what we look at the tacit theory is pretty solid. Given that such theory will, when left inexplicit, revert to "common sense," this hope strikes me as idle given that common sense is precisely what scientific insight almost always overturns.

Sunday, June 3, 2018

Some summer reading on why we often fail to understand one another

I just read a terrific review (here) of two books that deal with very large philosophical themes (meaning, truth, relativism, the aims of science etc.) that others might find amusing to look at. The review is by Tim Maudlin, a very very good philosopher, and it not only serves as an excellent advertisement for the two books under review (which sound tremendously amusing aside from being enlightening) but is well worth reading in itself. It's a paradigm of what a good review should be: it is engaging, informative, provocative and makes you want to read the originals.

The first part reviews a book by Adam Becker on quantum mechanics and some conundra that beset it. I like reading this kind of thing because it suggests that the problems I see as endemic to discussions in the mental sciences are part of the debate in the "real" sciences as well (at least if one takes quantum mechanics to be what a real successful science looks like (which I do)). The discussion reprises a debate between the heavyweights of 20th century science. On the one side we have Einstein and Schrodinger, and on the other Bohr, Heisenberg and the Copenhagen school. What Maudlin, following Becker, outlines is a kind of fruitless debate in which those that "won - and would continue to win - all the logical battles" would nonetheless decisively lose the war of ideas (actually, Maudlin says "propaganda war," so the war of ideas, though lost, was not seen to have been lost by most of the practitioners of the field) (p.9).

Who won the battles and lost the war? Einstein. Who lost the battles and won the war? Bohr. But what is striking in the telling is not the winners and losers, but Becker's and Maudlin's claim that this winning and losing really had nothing to do with the substance of the debates, because the winning side never really came to terms with the arguments of the losing side. As Maudlin puts it (p.9):
Bohr never came to grips with [Einstein's] argument. Indeed, it is unclear whether he ever understood it.
How could this be? In part because Einstein and Bohr had very different conceptions of the aims of scientific inquiry (Einstein shooting for explanation, Bohr for data coverage), which were in turn rooted in different conceptions of where the cognitive content of a theory comes from (the Positivist view that observational equivalence exhausts the whole content of "meaning" vs. the rejection of this). This made it very hard for those Einstein criticized to address his concerns seriously, which meant that they ended up addressing them non-seriously, if at all.

The remarkable feature of the story that Becker tells (a la Maudlin) is that Einstein's defeat came mainly by denigrating Einstein's powers in later life ("Einstein defeated, drifts into crankhood, never more doing significant physics" (10)) and by ignoring the arguments, and not just by happenstance but as an organizing principle of "debate." The Oppenheimer quote wrt David Bohm gives a taste for what opponents of the Copenhagen view were in for: "If we cannot disprove Bohm then we must agree to ignore him." So, in place of argument, there was propaganda and, it appears, some concentrated efforts at suppression (and Bohm is not the only one ignored). All of this became institutionalized via the mis-education of future physicists. It seems that not only in the mental sciences are people sure that books and papers they have never read make claims that have never been made.

The themes Becker pursues are further investigated in the other book that Maudlin reviews by Errol Morris. Again issues of meaning lie at the heart of the discussion. I will leave it to you to take a look, but it sounds like a terrific read, albeit one that has some problems (see here for another review, also favorable, but more critical).

Summer vaca is the best perk of academic life. If you are lucky (and too many are not), you get several months to just think and work and rejuvenate yourself intellectually. Here are two books to promote thought and amusement. They will also help you appreciate, by looking at another domain, that what look to be mere philosophical views cut deep. This does not mean that people with different views of "what counts" cannot argue with each other rationally. They can. But they can, it appears, only get so far before the inevitable misunderstandings generated by their starting points cloud the relevant issues. When smart people really don't appear to be engaging intellectually, then you can bet that some tacit but vital philosophical assumption is at issue, one that, being tacit, is not engaged and hence serves to stifle discussion. Should I now mention the Empiricism/Rationalism divide? Nah!

Monday, May 28, 2018

METI-linguistics

Bill Idsardi



On CNET (not usually a place to find anything about linguistics): coverage of the International Space Development Conference and Messaging Extraterrestrial Intelligence (METI).

Chomsky is quoted in the article as quipping, "To put it whimsically, the Martian language might not be so different from human language after all."

Bridget Samuels also made a presentation at the METI session. Bridget, maybe you can make a brief comment or two?


Friday, May 25, 2018

Three quintessentially minimalist projects

I was in Barcelona last week giving some lectures on the triumphant march forward of the Minimalist Program (MP). As readers may know, I believe that MP has been a success in its own terms in that it has gone a fair way towards answering the questions it first posed for itself, viz. Why do we have the FL we actually have and not some other? Others are more skeptical, but I believe that this is mainly because critics demand that MP address questions not central to its research mission. Of course, answering the MP question might leave many others untouched, but that hardly seems like a reason to disparage MP so much as a reason for pursuing other programs simultaneously. At any rate, this was what the lectures were about and I thank the gracious audience at the Autonomous University of Barcelona for letting me outline these views for their delectation. 

In getting these lectures into shape I started thinking about a question prompted by recent comments from Peter Svenonius (thx, see here). Peter thinks, if I understand him correctly, that the MP obsession (ok, my obsession) with Darwin’s Problem (DP) really adds nothing to the MP enterprise. Things could proceed more or less as we see them without this biological segue. This got me thinking about the following question: Which MP projects are largely motivated by DP concerns? I can think of three. They may well be motivated on other grounds as well. But they seem to me a direct consequence of taking the DP perspective on the emergence of FL seriously. This post is a first stab at enumerating these and explaining why I think they are intimately tied to DP (in much the same way that the P&P project was intimately tied to Plato’s Problem (PP)). So what are the three lines of inquiry? A warning: The order of discussion does not imply anything about their relative salience or importance to MP. I should note, that many of the points I make below, I have made elsewhere and before. So do not expect to be enlightened. This is probably more for my benefit than for yours.

First, Unification of the Modules (UoM). MP is based on the success of the P&P program, in particular the perceived success of GBish conceptions of FL/UG. Another way of saying this is that if you don’t think that GB managed to limn a fairly decent picture of the fine structure (i.e. the universals) of FL/UG then the MP project will seem to you to be, at best, premature and, at worst, hubristic. 

I believe that a lot of the problems that linguists have with MP have less to do with its failure to make progress in answering the MP question above than with the belief that the whole project presupposes accepting as roughly right a deeply flawed conception of FL/UG (viz. roughly the GB conception). So, for example, if you don't like classical case theory (and many syntacticians today do not) then you won't like a project that takes it to be more or less accurate and tries to derive its properties from deeper principles. If you don't think that classical binding theory is more or less correct then you won't like a project that tries to reduce it to something else. The problem for many is that what MP presupposes (namely that GB was roughly correct, but not fundamental) is precisely what they believe ought to be very much up for grabs. 

I personally have a lot of sympathy for this attitude. However, I also think that it misses a useful feature of the presupposition. MP-motivated unification/reduction doesn't require that the GB description be correct so much as that it be plausible enough (viz. that it be of roughly the right order of complexity) to make deriving its properties a useful exercise (i.e. an exercise that, if successful, will provide a useful model for future projects armed with a more accurate conception of FL/UG). To put this another way: the GB conception of FL/UG has identified design features of FL/UG (aka universals) that are theoretically plausible and empirically justifiable, and so it is worth asking why principles such as these should govern the workings of our linguistic capacities. In Poeppel's immortal phrase, these GB principles are the right grain size for analysis and have non-trivial empirical backing, so the exercise of showing why they should be part of FL would be doing something useful even should they fail to reflect the exact structure of FL.[1]

So, a central project animated by DP is unification of the disparate GB modules. And this is a very non-trivial project. As many of you know, GB attributes a rather high degree of internal modularity to FL. There are diverse principles regulating binding vs control vs movement vs selection/subcategorization vs theta role assignment vs case assignment vs phrase structure. From the perspective of Plato's Problem, the diversity of the modules does not much matter as their operations and principles are presumed to be innate (and hence not learned). In fact, the main impetus behind P&P architectures was to isolate the plausibly invariant features of Gs and explain them by attributing them to the internal workings of FL, thereby constraining the Gs FL produces to invariably respect these features. Thus the reason that Gs are always structure dependent is that FL has the property of only being able to construct Gs that are structure dependent. The reason that movement and binding require c-command is that FL imposes c-command as a condition on these diverse modular operations. The aim of GB was to identify and factor out the invariant properties of specific Gs and treat them as fixed features of FL/UG so that they did not have to be acquired on the basis of PLD (and a good thing too as there is not sufficient data in the PLD to fix them (aka PoS considerations apply to these)). The problem of G acquisition could then focus on the variable parts (where Gs differ) using the invariant parts as Archimedean fixed points for leveraging the PLD into specific Gs. That was the picture. And for this P&P project to succeed, it did not much matter how "complex" FL/UG was so long as it was innate. 
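To make the c-command condition mentioned above concrete, here is a toy sketch (my illustration, not anything drawn from a GB text) of the standard "first branching node" definition, checked over a made-up tree for "John likes himself"; the node labels, the tree shape, and the helper functions are all simplified for the example.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

def attach(parent, *kids):
    # Hang the kids under the parent and record their parent pointers.
    for k in kids:
        k.parent = parent
        parent.children.append(k)
    return parent

def dominates(a, b):
    # a (properly) dominates b iff b sits somewhere below a in the tree.
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False

def c_commands(a, b):
    # a c-commands b iff neither dominates the other and the first
    # branching node dominating a also dominates b.
    if dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None:
        if len(node.children) > 1:
            return dominates(node, b)
        node = node.parent
    return False

# [TP [DP John] [VP likes [DP himself]]]: the subject c-commands the anaphor,
# but not vice versa, which is why binding can relate them in that direction.
john, himself = Node("John"), Node("himself")
vp = attach(Node("VP"), Node("likes"), himself)
tp = attach(Node("TP"), john, vp)
print(c_commands(john, himself))   # True
print(c_commands(himself, john))   # False

The point is only that conditions like this are stated over structure, not strings, which is what makes them candidates for fixed, unlearned features of FL.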

All of this changes, and changes dramatically, once one asks how this system could have arisen. Then the internal complexity matters, and matters a lot. Indeed, once one asks this question there is a great premium on simple FL architectures, with fewer modules and fewer disparate principles, for the simpler the structure of FL, the easier it is to imagine how it might have arisen from the cognitive architecture of predecessors that did not have one. 

If this is correct, then one central MP project is to show that the diversity of the GB modules is only apparent and that they are only different reflections of the same underlying operations and principles. In other words, the project of unifying the modules is central to MP and it is central because of DP. A solution to DP requires that what appears to be a very complex FL system (i.e. what GB depicts) is actually quite simple and what appear to be very different modules with different operations and regulative principles are really all reflections of the same underlying generative procedures. Why? Because short of this it will be impossible to explain how the system that GB describes could have arisen from a mind without it. 

This is entirely analogous, in its logic, to Plato’s Problem. How can kids acquire the Gs they do with the properties they have despite a poverty of the linguistic stimulus? Because much of what they know they do not have to learn. How could humans have evolved an FL from non-FL cognitive minds? Because FL minds are only a very small very simple step away from the minds that they emerged from and this requires that the modular complexity GB attributes to FL is only apparent. It’s what you get when you add to the contents of non-linguistic minds the small simple addition MP hypothesizes bridged the ling/non-ling gap.

Are there other plausible motives for such a project, the project of unifying the modules? Well, perhaps. One might argue that an FL with unified modules is in some methodological sense better than one with non-unified ones. Something like a principle that says fewer modules are better than more. Again, I think that this is probably correct, but let's face it, this kind of methodological Ockhamist accounting is very weak (or at least perceived to be so). When push comes to shove, data coverage (almost?) always trumps such niceties (remember the ceteris paribus clause that always accompanies such dicta). So it is worth having a big empirical fact of interest driving the agenda as well. And there are few facts bigger and heftier than the fact that FL arose from non-FL capable minds, and it is easier to explain how this could have happened if FL capable minds are only mildly different from non-FL capable minds, and this means that the complex modularity that GB attributes to FL capable minds is almost certainly incorrect. That's the line of argument. It rests on DPish assumptions and, to my mind, provides a powerful empirical motivation for module unification, which is what makes unification a central MP project.

It suggests a second related project: not only must the modules be unified, but the unification should make use of the fewest possible linguistically proprietary operations and principles. In other words, linguistically capable minds, ones with FLs, should be as minimally linguistically special as possible. Why? Because evolution proceeds most smoothly when there is minimal qualitative difference between the evolved states. If the aim is to explain how language ready minds appeared from non language ready minds, then the fewer the differences between the two, the easier it will be to account for the emergence of the former from the latter. If one assumes that what makes an FL mind language ready are linguistically special operations and principles, then the fewer of these the better. In fact, in the best case there will be exactly a single relatively simple difference between the two, language ready minds just being non-language ready ones plus (at most) one linguistically special simple addition (the desideratum that it be simple motivated by the assumption that simple additions are more likely to become evolutionarily available than complex ones).[2]

So let's assess: there are two closely related MP projects: unify the GB modules and unify them using largely non-linguistically proprietary operations and principles. How far has this project gotten? Well, IMO, quite far. Others are sure to disagree. But the projects, though somewhat open textured, have proven to be manageable and the first in particular has generated useful hypotheses (e.g. the Merge Hypothesis and extensions thereof, like the Movement Theory of Control and Construal), which even if wrong have the right flavor (I know, I know, this is self serving!). Indeed, IMO, trying to specify exactly where and how these theories go wrong (if they do; color me skeptical, but I have dogs in these fights) and why they go wrong as they do is a reasonable extension of the basic MP projects. It is a tribute to how little MP concerns drive contemporary syntax that such questions are, IMO, rarely broached. Let me rant a bit.

Darwin's Problem (DP) enjoys as little interest among linguists today as Plato's Problem (PP) does (and did, in earlier times). Indeed, from where I sit, even PP barely animates linguistic investigations. So, for example, people who study variation rarely ask how it might be fixed (though there are notable exceptions). Similarly, people who propose novel principles and operations rarely ask whether and how they might be integrated/unified with the rest of the features of FL. Indeed, most syntacticians take the basic apparatus as given and rarely critically examine it (e.g. how many people worry about the deep overlap between Agree and I-merge?). These are just not standard research concerns. IMO, sadly, most linguists couldn't care less about the cognitive aspects of GG, let alone its possible bio-linguistic features. The object of study is language, not FL, and the technical apparatus is considered interesting to the degree that it provides a potentially powerful philological tool kit. 

Ok, so MP motivates two projects. There is one more, and it concerns variation. GB took variation to be bounded. It did this by conceiving of UG as providing a finite set of parameter values and of language acquisition as fixing those parameters. So, even if the space of possible Gs is very large, for GB, it is finite. Now, given the linguistic specificity of the parameters, and given that GB treats them as internal to FL, the idea that variation is a matter of parameter setting proves to be a deep MP challenge. Indeed, I would go so far as to say that if MP is on the right track, then FL does not contain a finite list of possible binary parameters and G acquisition cannot be a matter of parameter setting. It must be something else, something that is not specific to G acquisition. And this idea has caught on, big time. Let me explain.

I have many times mentioned the work by Berwick, Lidz and Yang on G acquisition. Each contains what is effectively a learning theory that constructs Gs from PLD using FL principles. It appears that this general idea is quite widely accepted now, with former parameter setting types (e.g. David Lightfoot) now arguing that "UG is open" and that there is "no evaluation of I-languages and no binary parameters" (1).[3] This view is much more congenial to MP as it removes the very specific parametric options from FL and treats variation as entirely a "learning" problem. G learning is no different than other kinds, it is just aimed at Gs.[4]

Of course, to make this work will require specifying what kids come to the learning problem with, what kinds of data they exploit, and what the details of the G learning theory are. And this is hard. It requires more than pointing to differences in the PLD and attributing differences in Gs to these differences; that is a long way from an actual learning theory which specifies how PLD and properties of FL combine to give you a G. Not the least important fact is that there are many ways to generalize from PLD to Gs and kids only exploit some of these.[5] That said, if there is an MP "theory" of variation it will consist of adumbrating the innate assumptions the LAD uses to fix a particular G on the basis of PLD. To date, we have some interesting proposals (in particular from Lidz and Yang and their colleagues in syntax) but no overarching theory.
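For a sense of what such a learning theory looks like in skeletal form, here is a toy sketch in the spirit of Yang's variational (linear reward-penalty) model. It is my illustration, not Berwick's, Lidz's or Yang's actual implementation: the two "grammars," their coverage sets, the learning rate and the fake PLD frequencies are all invented for the example.

import random

GAMMA = 0.02  # learning rate for the linear reward-penalty update

def parses(grammar, sentence):
    # Stand-in for real parsing: a grammar "parses" a token if it covers it.
    return sentence in grammar["covers"]

# Two hypothetical candidate grammars competing for probability mass.
grammars = [
    {"name": "G1 (head-initial)", "covers": {"VO", "AuxV", "PrepN"}, "p": 0.5},
    {"name": "G2 (head-final)",   "covers": {"OV", "VAux", "NPost"}, "p": 0.5},
]

# Fake "primary linguistic data": mostly, but not uniformly, head-initial.
pld = random.choices(["VO", "AuxV", "PrepN", "OV"], weights=[4, 3, 2, 1], k=5000)

for sentence in pld:
    # Sample a grammar according to its current probability.
    g = random.choices(grammars, weights=[x["p"] for x in grammars])[0]
    other = grammars[1] if g is grammars[0] else grammars[0]
    if parses(g, sentence):
        g["p"] += GAMMA * (1 - g["p"])   # reward the sampled grammar
    else:
        g["p"] -= GAMMA * g["p"]         # punish it
    other["p"] = 1 - g["p"]              # two grammars, so probabilities sum to 1

for g in grammars:
    print(f'{g["name"]}: p = {g["p"]:.3f}')

The moral of the sketch is just that, on this kind of story, variation falls out of a domain-general reward-penalty scheme over candidate Gs plus the distribution of the PLD, with no dedicated parameter-setting switchboard inside FL.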

Interestingly, if this project can be made to fly, then it will also be the front end of an MP theory of variation. To date, the main focus of research has been on unifying and simplifying FL and trying to determine how much of FL is linguistically proprietary. However, there is no reason that the considerable current typological work on G variation shouldn't feed into developing theories of learning aimed at explaining why we find the variation we do. It is just that this project is going to be very hard to execute well, as it will demand that linguists develop skills that are not currently part of standard PhD training, at least not in syntax (e.g. courses in stats, machine learning, and computation). But isn't this as it should be? 

So, does taking MP seriously make a difference? Yes! It spawns three concrete projects, all animated by the MP problematic. These projects make sense in the context of trying to specify the internal structure of an FL that could have evolved from earlier minds. So the programmatic aspects of MP are quite fecund, which is all that we can ask of a program.

And results? Well, here too I believe that we have made substantial progress as regards the first project, some as regards the third (though it is very very hard) and a little concerning the second.  IMO, this is not bad for 25 years and suggests that the DPish way of framing the MP issues has more than paid for itself.


[1] It's worth adding that this sort of exercise is quite common in the real sciences. Ideal gases are not actual gases, planets are not point masses, and our universe may not be the only possible one, but figuring out how they work has been very useful. 
[2] There is a lot of hand waving going on here. Thus, what evolves are genomes and what we are talking about here are phenotypic expressions thereof. We are assuming that simple phenotypic differences reflect simple genetic differences. Who knows if this is right. However, it is the standard assumption for this kind of biological speculation, so it would be a form of methodological dualism to treat it as suspect only in the linguistic case. See here for discussion of this "phenotypic gambit" and its role in evolutionary thinking.
[3] See "Discovering New Variable Properties without Parameters," in Massimo Piattelli-Palmarini and Simin Karimi, eds., "Parameters: What are they? Where are they?" Linguistic Analysis 41, special edition (2017).
A very terse version of this view is advanced in Hornstein (2009) on entirely MP grounds. The main conceptual difference between approaches like Lightfoot's and the one I advanced is that the former relies on the idea that "children DISCOVER variable properties of their language through parsing" (1), whereas I waved my hands and mumbled something about curve fitting given an enhanced representation provided by FL (see here for slightly more elaboration).
[4] This folds together various important issues, the most important being that there is no overall evaluation metric for parameter setting. Chomsky argued that the shift from evaluation metrics to parameter-setting models increased the latter's feasibility because applying global evaluation metrics to Gs is computationally intractable. I think Chomsky might have thought that parameter setting is more localized than G evaluation and so will not require fancy learning theories. It turns out, as Dresher and Kaye long ago noted, that parameter-setting models have their own tractability issues unless the parameters can be set independently of one another. If they are not independent, problems quickly arise (e.g. it is hard to fix parameters once and for all). 
Furthermore, it is not clear to me that something like global measures of G fitness can be entirely avoided, though Lightfoot insists that they should be. The main reason for my skepticism is empirical and revolves around the question of whether the space of G options is scattered or not. At least in syntax, it seems that different Gs are kept relatively separate (e.g. bilinguals might code switch between French and English but they don’t syntactically blend them to get an “average” of the two in Frenglish. Why not?). This suggests that Gs enjoy an integrity and this is what keeps them cognitively apart. Bill Idsardi tells me that this might be less true on the sound side of things. But as regards the syntax, this looks more or less correct. If it is, then some global measure distinguishing different Gs might be required. 
I should add that more recently, if I recall correctly, Fodor and Sakas have argued that the evaluation metric cannot be completely dispensed with even on their “parsing” account.
[5] So, for example, invoking "parsing" as the driver behind acquisition does not do much unless one specifies how parsing works. Recall that standard parsers (e.g. the Marcus parser) embody Gs that guide how input data is analyzed. No G, no parsing. But if the aim is to explain how Gs are acquired, then one cannot presuppose that the relevant G already exists as part of the parser. So what does a parse consist in, in detail? This is a hard problem, and it turns out that there are many factors that the child uses to analyze a string so as to recover a meaning. The MP project is to figure out what this is, not to name it.

Tuesday, May 22, 2018

David Poeppel in Quanta

David P has been a strong critic of cog-neuro practice. He has not minced words about how the field has badly misfired by not appreciating the Marrian complexity of research. Today this view gets some great publicity (here). Quanta (which is kinda like the New Yorker for science stuff) has published a 4 page discussion of his work and his critical views. Take a look and enjoy. It's nice when the right views get some decent press. Who knows, maybe next Bernie will enjoy some decent innings.

Wednesday, May 16, 2018

Talk about confirmation!!

As Peter notes in the comments section to the previous post, there has been dramatic new evidence for the Gallistel-King Conjecture (GKC) coming from David Glanzman's lab at UCLA (here). Their experiments were done on Aplysia. Here is the abstract of the paper:

The precise nature of the engram, the physical substrate of memory, remains uncertain. Here, it is reported that RNA extracted from the central nervous system of Aplysia  given long-term sensitization training induced sensitization when injected into untrained animals; furthermore, the RNA-induced sensitization, like training-induced sensitization, required DNA methylation. In cellular experiments, treatment with RNA extracted from trained animals was found to increase excitability in sensory neurons, but not in motor neurons, dissociated from naïve animals. Thus, the behavioral, and a subset of the cellular, modifications characteristic of a form of nonassociative long-term memory in Aplysia  can be transferred by RNA. These results indicate that RNA is sufficient to generate an engram for long-term sensitization in Aplysia  and are consistent with the hypothesis that RNA-induced epigenetic changes underlie memory storage in Aplysia.
Here is a discussion of the paper in SciAm.

The results pretty much speak for themselves and they clearly comport very well with the GKC, even the version that garnered the greatest number of superciliously raised eyebrows when mooted (viz. that the chemical locus of memory is in our nucleic acids (RNA/DNA)). The Glanzman et al. paper proposes just this:

A major advantage of our study over earlier studies of memory transfer is that we used a type of learning, sensitization of the defensive withdrawal reflex in Aplysia, the cellular and molecular basis of which is exceptionally well characterized (Byrne and Hawkins, 2015; Kandel, 2001; Kandel, 2012). The extensive knowledge base regarding sensitization in Aplysia enabled us to show that the RNA from sensitized donors not only produced sensitization-like behavioral change in the naïve recipients, but also caused specific electrophysiological alterations of cultured neurons that mimic those observed in sensitized animals. The cellular changes observed after exposure of cultured neurons to RNA from trained animals significantly strengthens the case for positive memory transfer in our study. Another difference between our study and earlier attempts at memory transfer via RNA is that there is now at hand a mechanism, unknown 40 years ago, whereby RNA can powerfully influence the function of neurons: epigenetic modifications (Qureshi and Mehler, 2012). In fact, the role of ncRNA-mediated epigenetic changes in neural function, particularly in learning and memory, is currently the subject of vigorous investigation (Fischer, 2014; Landry et al., 2013; Marshall and Bredy, 2016; Nestler, 2014; Smalheiser, 2014; Sweatt, 2013). Our demonstration that inhibition of DNA methylation blocks the memory transfer effect (Fig. 2) supports the hypothesis that the behavioral and cellular effects of RNA from sensitized Aplysia in our study are mediated, in part, by DNA methylation (see also Pearce et al., 2017; Rajasethupathy et al., 2012). The discovery that RNA from trained animals can transfer the engram for long-term sensitization in Aplysia offers dramatic support for the idea that memory can be stored nonsynaptically (Gallistel and Balsam, 2014; Holliday, 1999; Queenan et al., 2017), and indicates the limitations of the synaptic plasticity model of long-term memory storage (Mayford et al., 2012; Takeuchi et al., 2014).


Two remarks: First, as the SciAm discussion makes clear, selling this idea will not be easy. Scientists are, rightfully in my opinion, a conservative lot and it takes lots of work to dislodge a well entrenched hypothesis. This is so even for views that seem to have little going for them. Gallistel (& Balsam) argued extensively that there is little good reason to buy the connectionist/associationist story that lies behind the standard cog-neuro commitment to net based cognition. Nonetheless, the idea is the guiding regulative ideal within cog-neuro and it is unlikely that it will go quietly. Or as Glanzman put it in the SciAm piece:
“I expect a lot of astonishment and skepticism,” he said. “I don’t expect people are going to have a parade for me at the next Society for Neuroscience meeting.”
The reason is simple actually: if Glanzman is right, then those working in this area will need substantial retraining, as well as a big time cognitive rethink. In other words, if the GKC is on the right track, then what we think of as cog-neuro will look very different in the future than it does today. And nobody trained in earlier methods of investigation and basic concepts suffers a revolution gladly. This is why we generally measure progress in the sciences in PFTs (i.e. Planck Funereal Time).

Second, it is amazing to see just how specific the questions concerning the bio basis of memory become once one makes the shift over to the GKC. Here are two questions that the Glanzman et al. paper ends with. Note the detailed specificity of the chemical speculation:
Our data indicate that essential components of the engram for LTM in Aplysia  can be transferred to untrained animals, or to neurons in culture, via RNA. This finding raises two questions: (1) Which specific RNA(s) mediate(s) the memory transfer?, and (2) How does the naked RNA get from the hemolymph/cell culture medium into Aplysia  neurons? Regarding the first question, although we do not know the identity of the memory-bearing molecules at present, we believe it is likely that they are non-coding RNAs (ncRNAs). Note that previous results have implicated ncRNAs, notably microRNAs (miRNAs) and Piwi-interacting RNAs (piRNAs) (Fiumara et al., 2015; Rajasethupathy et al., 2012; Rajasethupathy et al., 2009), in LTM in Aplysia . Long non-coding RNAs (lncRNAs) represent other potential candidate memory transfer molecules (Mercer et al., 2008). Regarding the second question, recent evidence has revealed potential pathways for the passage of cell-free, extracellular RNA from body fluids into neurons. Thus, miRNAs, for example, have been detected in many different types of body fluids, including blood plasma; and cell-free extracellular miRNAs can become encapsulated within exosomes or attached to proteins of the Argonaut (AGO) family, thereby rendering the miRNAs resistant to degradation by extracellular nucleases (Turchinovich et al., 2013; Turchinovich et al., 2012). Moreover, miRNA-containing exosomes have been reported to pass freely through the blood-brain barrier (Ridder et al., 2014; Xu et al., 2017). And it is now appreciated that RNAs can be exchanged between cells of the body, including between neurons, via extracellular vesicles (Ashley et al., 2018; Pastuzyn et al., 2018; Smalheiser, 2007; Tkach and Théry, 2016; Valadi et al., 2007). If, as we believe, ncRNAs in the RNA extracted from sensitized animals were transferred to Aplysia  neurons, perhaps via extracellular vesicles, they likely caused one or more epigenetic effects that contributed to the induction and maintenance of LTM (Fig. 2 ).
Which RNAs are doing the coding? How are they transferred? Note the interest in blood flow (not just electrical conductance) as "cognitively" important. At any rate, the specificity of the questions being mooted is a good indication of how radically the field of play will alter if the GKC gets traction. No wonder the built-in skepticism. It really does overturn settled assumptions if correct. As SciAm puts it:
This view challenges the widely held notion that memories are stored by enhancing synaptic connections between neurons. Rather, Glanzman sees synaptic changes that occur during memory formation as flowing from the information that the RNA is carrying.
So, is GKC right? I bet it is. How right is it? Well, it seems that we may find out very very soon.

Oh yes, before I sign off (gloating and happy I should add), let me thank Peter and Bill and Johan and Patrick for sending me the relevant papers. Thx.

Addendum: Here's a prediction. The Glanzman paper will be taken as arguing that synaptic connections play no role in memory. Now, my completely uneducated hunch is that this strong version may well be right. However, it is not really what the Glanzman paper claims. It makes the more modest claim that the engram is at least partly located in RNA structures. It leaves open the possibility that nets and connections still play a role (though an earlier paper by him argues that it is quite unclear how they do so, as massive reorganization of the net seems to leave prior memories intact). So the fall back position will be that the GKC might be right in part but that a lot (most) of the heavy cog-neuro lifting will be done by neural nets. Here is a taste of that criticism from the SciAm report:
“This idea is radical and definitely challenges the field,” said Li-Huei Tsai, a neuroscientist who directs the Picower Institute for Learning and Memory at the Massachusetts Institute of Technology. Tsai, who recently co-authored a major review on memory formation, called Glanzman’s study “impressive and interesting” and said a number of studies support the notion that epigenetic mechanisms play some role in memory formation, which is likely a complex and multifaceted process. But she said she strongly disagreed with Glanzman’s notion that synaptic connections do not play a key role in memory storage.
Here is where the Gallistel arguments will really come into play. I believe that the urgency of answering Randy's question (how do you store a retrievable number in a connectionist net?) will increase for precisely the reasons he noted. The urgency will increase because we know how a standard computing device can do this, and now that we have identified the components of a chemical computer, we know how this could be done without nets. So those who think that connections are the central device will have to finally face the behavioral/computational music. There is another game in town. Let the fun begin!!