Incompetent statistics does not necessarily doom a research paper: some findings are solid enough that they show up even when there are mistakes in the data collection and data analyses. But we've also seen many examples where incompetent statistics led to conclusions that made no sense but still received publication and publicity.

Someone once mentioned to me the following advice that they got in their first stats class (at MIT, no less). The prof said that if you need fancy stats to drag a conclusion from the data generated by your experiment, then do another experiment. Stats are largely useful for distinguishing signal from noise. When things are messy, they can help you find the underlying trends. Of course, there is always another way of doing this: make sure that things are not messy in the first place, which means making sure your design does not generate a lot of noise. Sadly, we cannot always do this, and so we need to reach for that R package. But, more often than not, powerful techniques create a kind of moral hazard with respect to our methods of inquiry. In the end, there really is no substitute for thinking clearly and creatively about a problem.
Here's a second post for those clamoring to understand what their statistically capable colleagues are talking about when they talk p-values. Look at the comments too, as some heavyweights chime in. Here's one that Sean Carroll (the physicist) makes:
Particle physicists have the luxury of waiting for five sigma since their data is very clean and they know how to collect more and more of it.

In this regard, I think that most linguists (those not doing pragmatics) are in a similar situation. The data is pretty clean and we can easily get lots more.
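For concreteness, here is a quick sketch of what "five sigma" cashes out to in p-value terms (my own illustration, not from the post or the comments), using base R:

```r
## One-tailed p-value of a 5-sigma excess under a normal distribution,
## the convention behind the particle physicists' discovery threshold:
pnorm(5, lower.tail = FALSE)
## [1] 2.866516e-07

## By contrast, the usual p < .05 cutoff (two-tailed) is only about 2 sigma:
qnorm(0.05 / 2, lower.tail = FALSE)
## [1] 1.959964
```

That's a gap of roughly five orders of magnitude in tolerance for noise, which is the luxury Carroll is pointing to: you can afford a standard that strict only when clean data is cheap to collect.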