
New project, partly designed by a University of Cambridge researcher, aims to improve transparency in science by sharing ‘how the sausage is made’.
Having the code means that others have a better chance of replicating your work.
Stephen Eglen
A new pilot project, designed by a Cambridge researcher and supported by the Nature family of journals, will evaluate the value of sharing the code behind published research.
For years, scientists have discussed whether and how to share data from painstaking research and costly experiments. Some are further along in their efforts toward ‘open science’ than others: fields such as astronomy and oceanography, for example, involve such expensive and large-scale equipment and logistical challenges to data collection that collaboration among institutions has become the norm.
Recently, academic journals, including several Nature journals, have turned their attention to another aspect of the research process: computer programming code. Code is becoming increasingly important in research because scientists often write their own computer programs to interpret their data, rather than using commercial software packages. Some journals now include scientific data and code as part of the peer-review process.
Now, in a paper published in the journal Nature Neuroscience, a group of researchers from the UK, Europe and the United States has argued that the sharing of code should be part of the peer-review process. Separately, the journal has announced a pilot project to ask future authors to make their code available for review.
Code is an important part of the research process, and often the only definitive account of how data were processed. “Methods are now so complex that they are difficult to describe concisely in the limited ‘methods’ section of a paper,” said Dr Stephen Eglen from Cambridge’s Department of Applied Mathematics and Theoretical Physics, and the paper’s lead author. “And having the code means that others have a better chance of replicating your work, and so should add confidence.”
Making the programs behind the research accessible allows other scientists to test the code and reproduce the computations in an experiment: in other words, to reproduce results and solidify findings. It’s the “how the sausage is made” part of research, said co-author Ben Marwick, from the University of Washington. It also allows the code to be used by other researchers in new studies, making it easier for scientists to build on the work of their colleagues.
“What we’re missing is the convention of sharing code or the tools for turning data into useful discoveries or information,” said Marwick. “Researchers say it’s great to have the data available in a paper (increasingly raw data are available in supplementary files or specialised online repositories) but the code for performing the clever analyses in between the raw data and the published figures and tables are still inaccessible.”
Other Nature Research journals also provide for code review as part of the article evaluation process. Since 2014, the company has encouraged writers to make their code available upon request.
The Nature Neuroscience pilot focuses on three elements: whether the code supporting an author’s main claims is publicly accessible; whether the code functions without mistakes; and whether it produces the results cited. At the moment this is a pilot project to which authors can opt in. It may be that in future it becomes mandatory, and a paper will be accepted only once its code has been reviewed.
“This extra step in the peer review process is to encourage ‘replication’ of results, and therefore help reduce the ‘replication crisis’,” said Eglen. “It also means that readers can understand more fully what authors have done.”
An open science approach to sharing code is not without its critics, as well as scientists who raise legal and ethical questions about the repercussions. How do researchers get proper credit for the code they share? How should code be cited in the scholarly literature? How will it count toward tenure and promotion applications? How is sharing code compatible with patents and commercialization of software technology?
“We hope that when people do not share code it might be seen as ‘having something to hide,’ although people may regard the code as ‘theirs’ and their IP, rather than something to be shared,” said Eglen. “Nowadays, we believe the final paper is the ultimate representation of a piece of research, but actually the final paper is just an advert for the scholarship, which here is the computer code to solve a particular task. By sharing the code, we actually get the most useful part of the scholarship, rather than the paper, which is just the author’s ‘gloss’ on the work they have done.”
Adapted from a University of Washington source.