23 February 2016

Misconceptions about academic data sharing

Written by Benedikt Fecher & Gert Wagner.

In a recent editorial in the New England Journal of Medicine, Longo and Drazen critically assessed the concept of data sharing in medicine. Their main concern is that a “new class of research person will emerge” that uses data for their own original research questions. The authors later refer, albeit indirectly, to this class of researcher as “research parasites”. The label “research parasites” certainly does not reflect the zeitgeist of increasingly collaborative research and of initiatives towards openness and transparency. It does, however, reflect common misconceptions about academic data sharing.

Longo and Drazen make the (valid) point that data might be misinterpreted. On the other hand, misinterpretation might be a matter of insufficient data documentation by the primary researchers. Moreover, the mere possibility of misinterpretation cannot be an argument against sharing research data.

Longo and Drazen miss the very point of scientific research when they write that researchers may “even use the data to try to disprove what the original investigators had posited”. It is at the core of the scientific paradigm that researchers take nothing as final truth. This is what Popper proposed in his critical rationalism and Merton in his conceptualization of skepticism.

Longo and Drazen’s requirement to “start with a novel idea, one that is not an obvious extension of the reported work” is simply misleading. Medical research in particular (which is the subject of Longo and Drazen’s editorial) can profit immensely from old ideas through meta-analyses and replication studies that use original datasets.

However, the authors touch upon a valid point: the issue of adequate credit for scientific data sharing. They indicate that the adequate form of recognition for data sharing is co-authorship. They suggest working “symbiotically, rather than parasitically, with the investigators holding the data, moving the field forward in a way that neither group could have done on its own.”

While that is certainly true in particular cases, we argue that co-authorship as the sole instrument for giving credit will unnecessarily restrict the potential of data sharing and can even be to the detriment of the original researcher, for instance if the resulting publications lack quality. And in the case of replication studies, co-authorship makes no scientific sense.

The best instrument for giving “credit where credit is due” would be a much greater appreciation of data sharing by research communities, expressed through citations of data sets and through the consideration of data “production” in career prospects, funding applications, and evaluations.

With this end in mind, this “new class of research person” is exactly the opposite of a “research parasite”. This person would be someone who is essential to the scientific enterprise in an increasingly data-intensive and collaborative environment. Longo and Drazen’s editorial, however, shows that there is still a long way to go before we reach Open Science.

This post represents the view of the authors and does not necessarily represent the view of the institute itself. For more information about the topics of these articles and associated research projects, please contact info@hiig.de.