Lots of interesting points raised – the ‘idealised’ scientific method relies on falsifiability, and therefore on the confirmation or falsification of previous work, but the modern practice of science, particularly in academia, is built on novelty of research and on many procedures that are borderline impossible to reproduce or replicate.
Obviously, a lot of what is published will be proven wrong in the future, and as such having something published in a ‘peer reviewed journal’ is no guarantee of the rightness of the findings – the best you can hope for is that the study has been carried out correctly. Clearly there are areas under the broad canopy of science for which this is a huge issue – nutrition and food science being one, where papers with opposing conclusions seem to be published on an almost weekly basis (e.g. low carb / high protein vs high carb / low protein, cholesterol, etc.).
One thing I was told at the outset of my PhD was ‘there is no such thing as bad data, just wrong interpretation’. Now clearly this is a little tongue-in-cheek, as you can have data that is misleading for any number of reasons (poor sampling, incorrect sample labelling or handling, inappropriate analytical methods, and equipment failure being the first few I can think of), but the interpretation is then whether to include or exclude that data, and whether something mathematical or statistical can be done to improve its reliability (e.g. applying a correction for instrumental drift). The key point, however, is that the raw data is there both for further investigation and for checking by others.
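To give a concrete (and entirely made-up) example of the sort of drift correction I mean: measure a check standard at intervals during a run, fit the apparent trend in its readings over time, and subtract that trend from the sample measurements, keeping the raw numbers alongside. A rough Python sketch, with purely illustrative names and values rather than any particular lab’s procedure:

import numpy as np

# Times (hours into the run) at which a check standard was measured,
# and the values the instrument reported for it (illustrative numbers only).
std_times = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
std_readings = np.array([100.0, 100.4, 100.9, 101.3, 101.8])
std_true_value = 100.0  # certified value of the standard

# Fit a straight line to the standard's apparent drift over time.
slope, intercept = np.polyfit(std_times, std_readings - std_true_value, 1)

# Correct the sample measurements, but keep the raw data intact so that
# both the originals and the correction can be checked by others.
sample_times = np.array([1.0, 3.0, 5.0, 7.0])
sample_raw = np.array([52.3, 48.7, 61.2, 55.9])
sample_corrected = sample_raw - (slope * sample_times + intercept)

print(sample_corrected)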
The Feynman anecdote upthread shows that these are not new problems, and as an aside I will say that in both my MSc and PhD research I failed to reproduce results consistent with previously published work. Whether this was down to my incompetence in the lab or to inadequate documentation of the methods in the literature is unknowable.
However, I think there are grounds for believing that the issues around reproducibility and erroneous (though not fraudulent or fabricated) research are getting worse, for a variety of reasons:
1 – The ever-increasing prevalence of ‘publish or perish’ attitudes
2 – ‘Impact factors’, and the tendency of the highest-profile journals (Nature, Science etc) to focus on headline-grabbing research, which of course frequently means the papers with the most unexpected (and therefore most likely to be incorrect) results. Wasn’t it Carl Wunsch who said something along the lines of ‘just because something is published in Nature doesn’t mean it is necessarily wrong’?
3 – Over-reliance on (and an infatuation with) technology. Think Monty Python’s ‘machine that goes ping’. It’s very easy to treat complex equipment or computerised statistical processes as ‘black boxes’: input something at one end, push a couple of buttons, and get something out the other that you treat as data. I suspect that in many cases researchers know less than they let on about what goes on inside these black boxes, and so can make inappropriate choices that affect the output (a toy illustration follows below this list). And then they have to explain it in an article (with some help from the supplementary information) to a degree that someone else could replicate the process. Climate modelling, of course, has this issue writ large.
4 – Sometimes an unwillingness to share data, code or statistical procedures in a manner that allows reasonable replication. Anyone who has followed ClimateAudit for a while knows the difficulties encountered in reverse-engineering the statistical procedures used in some palaeoclimate papers, and the unwillingness of researchers to share information. It’s back to Phil Jones’s comment: why should I give you my data, when you’ll only try to find something wrong with it…
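On point 3, here is the kind of toy example I have in mind (my own illustration, not taken from any particular paper): the same moving-average smoothing gives different endpoint behaviour depending on an easily-overlooked boundary-handling choice buried inside the routine.

import numpy as np

series = np.linspace(1.0, 10.0, 10)   # a plain rising trend
window = np.ones(5) / 5.0             # 5-point moving average

# 'same' mode silently pads the series with zeros beyond its ends,
# so the smoothed values near the boundaries are dragged towards zero.
padded_with_zeros = np.convolve(series, window, mode='same')

# 'valid' mode only returns points where the window fits entirely within
# the data, avoiding the artefact but shortening the series.
valid_only = np.convolve(series, window, mode='valid')

print(padded_with_zeros[-3:])  # endpoint values depressed by the zero padding
print(valid_only[-3:])         # no padding artefact, but fewer points

Neither choice is ‘wrong’, but unless the choice is documented, someone trying to replicate the smoothed series from the written description alone may never match it.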
In most cases, though, I think it is important to remember never to attribute to conspiracy (or malpractice) what can equally well be explained by cock-up.