First, I have no idea if this is much worse than before. Did science faithfully replicate its findings before the modern era, and have things since gotten worse? Maybe a replication rate of 25% (the usual horror-story number) is better than it used to be. How do we know what a good rate ought to be? Maybe replicating 25% of experiments is amazingly good. We need a base rate, and, so far, I have not seen one provided. And until I do see one, I cannot know whether we are in crisis mode or not. But I am wary, especially of decline-from-a-golden-age stories. I know we no longer live in an age of giants (nobody ever thinks they live in an age of giants). The question is whether 50 years from now we will discover that we actually had lived in such a golden age. You know, when the dust has settled and we can see things more clearly.
Second, I think that part of the frustration with our current science comes from treating anything with numbers and "experiments" as science. The idea seems to be that one can do idea-free investigation: experiments are good or not on their own, regardless of the (proto)theory they are tacitly or explicitly based on. IMO, what makes the "real" sciences experimentally stable is not only their superior techniques, but the excellent theory they bring to the investigative table. This body of knowledge serves as a prophylactic against misinterpretation. Remember, never trust a fact until it has been verified by a decent theory! And, yes, the converse also holds. But the converse is taken as definitional of science, while the role theory plays in regulating experimental inquiry is, IMO, regularly under-appreciated.
So, I am skeptical. This said, there is one very big source of misinformation out there, especially in domains where knowledge translates into big money (and power). We see this in the global warming debates. We saw it in research into tobacco and cancer. Indeed, there are whole public relations outfits whose main activity is to spread doubt and misinformation dressed up as science. And recently we have been treated to a remarkable example of this. Here are two interesting pieces (here, here) on how the sugar industry shaped nutrition science quite explicitly and directly for its own benefit. These cases leave little to the imagination as regards science-disrupting mechanisms. And they occurred a while ago, one might be tempted to say in the golden age.
As funding for research becomes more and more privatized, this kind of baleful influence on inquiry is sure to increase. People want to get what they are paying for, and research that impinges on corporate income is going to be in the firing line. If what the articles say is correct, the agnotology industry is very, very powerful.
A second interesting piece, for those interested in the Sapir-Whorf hypothesis. I have been persuaded by people like Lila that there is no real basis for the hypothesis, i.e. that one's particular language has, at best, a mild influence on the way that one perceives the world. Economists, however, are unconvinced. Here is a recent piece arguing that the gender structure of a language's pronoun system has effects on how women succeed sociopolitically. Here is the conclusion:
First, linguistic differences can be used to uncover new evidence such as that concerning the formation and persistence of gender norms. Second, as the observed association between gender in language and gender inequality has been remarkably constant over the course of the 20th century, language can play a critical role as a cultural marker, teaching us about the origins and persistence of gender roles. Finally, the epidemiological approach also offers the possibility to disentangle the impact of language from the impact of country-of-origin factors. Our preliminary evidence suggests that while the lion's share of gender norms can be attributed to other cultural and environmental influences, a direct role for language should not be ignored.

Evaluating this is beyond my pay grade, but it is interesting and directly relevant to the Sapir-Whorf hypothesis. True? Dunno. But not uninteresting.
Another relevant readable is Andrew Gelman's monumental post from today -- he certainly doesn't seem to believe that the problem with social psychology is that it doesn't have enough theory.
Do you think that it is coincidental that we see no signs of a similar crisis in physics or chemistry, or in visual and auditory perception? Or, I would add, in linguistics and psycholinguistics? It arises in very theory-poor domains like social psychology and medicine. In these latter domains there is very little non-trivial insight to be had. That's why they study himmicanes! So, IMO, the problem really is the absence of insight into mechanisms below the surface visibles. We know nothing significant and so cannot weed out the preposterous. That is one of the reasons junk science thrives.
I am not sure Gelman would endorse this. I think he thinks that theory-free inquiry is less fraught than I do.
I definitely agree that theory is important and is a significant part of the problem, but it's not the whole story. The problem goes deeper than those studies with patently ridiculous hypotheses, so it can be misleading to focus (as Gelman does) on the himmicanes study or on precognition. Measurement problems, underpowered studies, "p-hacking", etc. affect studies that are motivated by very reasonable theories; I have certainly heard of several replication failures in psycholinguistics. We don't hear a lot about replication problems in linguistics in large part because most of the data is extremely robust (*what did you eat chocolate and), but also because we don't have great mechanisms for drawing attention to problematic judgments and replication failures. As Simine Vazire recently pointed out, there are higher barriers in psychology to publishing replication failures than new data, and the situation in linguistics seems to be similar: quibbles about the data aren't in themselves worth publishing because they're deemed not of theoretical interest.
Yes, theory is not everything, but it is an important thing. It is AT LEAST a good compact way of coding what we have interesting evidence for, so a coding of what we (sorta) know. And it is often ignored when failures are discussed. Like I said, if the problems are sociological, or a fall from prior grace, why don't these failures hit the real sciences? Why don't we hear about replication failures in physics going out of control? Are they not subject to the same publication pressures, incentives, rewards? Are they more virtuous? If you prick a physicist, does s/he not bleed? I would say that in the real sciences there is a sense of what an absurd experimental result would look like, and so it can be closely questioned.
DeleteNow, "I have certainly heard of several replication failures in psycholinguistics." Ands have I. BUT, there is no apparent epidemic, nor in perception either. Why not? The idea that there is an epidemic of bad behavior makes no sense if it is limited by domain of inquiry, especially when the malefactors are located in fields where we know next to nothing. Oh yes: and how many replication failures indicates scientific decline? Baselines would be nice.
"and the situation in linguistics seems to be similar: quibbles about the data aren't in themselves worth publishing because they're deemed not of theoretical interest." That has not been my experience. You are right that linguists are pretty lucky, at least in central areas of syntax. Judgments get a lot hairier when you start asking about semantic ambiguities involving 4 quantifiers. At any rate, it is the last part of the sentence I would object to: of course one measure of "interest" is the impact of the data point on what we think we know. If it is refractory to ALL our theory then why is it interesting. The idea that facts speak for themselves is a prejudice we would be well rid of. Part of the point of a fact is to explain how it speaks to anything. And anything in our little part of the world is how it speaks to SOME theoretical (i.e. non-trivial) concern.
BTW, would be nice to know about these failed replications: name names.
I think Gelman also appreciates that bad measurement and p-hacking and the like are problematic, but that you get a kind of synergistic nonsense from the conjunction of these with Gelman & Loken's Garden of Forking Paths and weak theories.
So, sure, you can get p-hacking causing problems even in fields with strong theoretical foundations, but that alone won't cause the kind of widespread problems that social psychology is exhibiting (and has since the 50s, at least according to one of the comments on Gelman's post).
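Just to make that concrete, here is a toy simulation (a sketch with made-up numbers, not a model of any actual literature). With no true effects at all, simply letting the analyst pick whichever of several plausible outcome measures "works" pushes the rate of nominally significant findings well past the nominal 5%. Weak theory is what supplies both the large stock of null hypotheses and the freedom to choose among measures in the first place.

```python
# A toy simulation of "forking paths": with no true effects at all,
# letting the analyst pick the best of several plausible outcome
# measures inflates the rate of "significant" findings.
# (All numbers are invented for illustration.)
import math
import random
import statistics

random.seed(1)

def one_null_study(n_per_group=20, n_measures=1):
    """Run one study with no true effect; return the smallest two-sample
    p-value across n_measures candidate outcome measures."""
    best_p = 1.0
    for _ in range(n_measures):
        a = [random.gauss(0, 1) for _ in range(n_per_group)]
        b = [random.gauss(0, 1) for _ in range(n_per_group)]
        se = math.sqrt(statistics.variance(a) / n_per_group +
                       statistics.variance(b) / n_per_group)
        z = (statistics.mean(a) - statistics.mean(b)) / se
        # Normal approximation to the t distribution; fine for a toy example.
        p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
        best_p = min(best_p, p)
    return best_p

def false_positive_rate(n_measures, n_studies=5000):
    hits = sum(one_null_study(n_measures=n_measures) < 0.05
               for _ in range(n_studies))
    return hits / n_studies

print(false_positive_rate(n_measures=1))  # close to the nominal 0.05
print(false_positive_rate(n_measures=5))  # several times the nominal rate
```

In a field where theory pins down both the hypothesis and the measure, there is much less of this freedom to exploit, even under identical publication incentives.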
I don't have any real evidence to back this up, but it also seems to me that bad measurement tends to go hand in hand with weak theories. Maybe the most obvious case of this is the famous age-priming study, with walking speed as the (stereotypically old-ish) outcome variable. Why walking speed and not any number of other behaviors that are stereotypically old?