To better understand how neural connections are refined, researchers have long recorded the spontaneous firing patterns generated by developing neural circuits, including those of the retina. In a recent study in GigaScience, Stephen Eglen from the University of Cambridge, UK, and colleagues curate a large volume of data from retinal electrophysiological studies into a comprehensive repository and analysis framework. This repository not only provides an important resource for researchers studying visual development but also stands out as one of only a few publicly available electrophysiology datasets. In line with the reproducible research paradigm, the repository allows users to see and use the code that generated each figure and table of the study, bringing transparency to how the results were calculated and enabling other researchers to build more easily upon previous work.
Here the reviewers of the study (whose reports are freely available as part of GigaScience’s open peer-review policy), Thomas Wachtler from Ludwig-Maximilians-University Munich, Germany, and Christophe Pouzat from Paris Descartes University, France, share their thoughts on the issues of reproducibility and transparency raised by this study.
Why is the reproducible research paradigm important, and how does this study in GigaScience address this?
CP: Taking a somewhat ‘extreme’ stance, there is no (natural) science without reproducibility. While there is a long tradition of describing experiments in detail in the literature (making them mostly reproducible), the standards for describing the analyses and simulations associated with published experimental data have unfortunately been much weaker. Developing tools to implement the reproducible research paradigm is very important to improve the situation; publishing papers, like the present one, that describe simply and beautifully how the paradigm is implemented in a highly relevant scientific context is also of paramount importance.
Do you think electrophysiological data presents a particular challenge in terms of sharing and reproducing data?
TW: A fundamental requirement for open data to be useful is not only technical accessibility but also practicability; that is, both data and metadata must be provided in standard, or at least clearly documented and simple, formats. The field of electrophysiology faces a notorious diversity and complexity of data and formats. To present the data in a unified way, Eglen and colleagues made impressive efforts to read the data from different formats and to convert and annotate them. Ideally, such efforts could be greatly reduced in the future if common standards were established in the community. The International Neuroinformatics Coordinating Facility (INCF) Program on data sharing, in which members of CARMEN, CRCNS.org, and G-Node are actively participating, is working towards such standards.
Did you manage to test and recreate the analyses in the study and how long did it take you?
CP: It took me a couple of hours to get the data, the few custom-developed routines, and the ‘vignette’ (that is, in the jargon of the open-source R statistical software, an executable file mixing a description of what the code is doing with the code itself), and to reproduce exactly the analysis presented in the study (using a netbook, not a heavy-duty desktop computer). With a few more hours, I was able to modify the authors’ code to replace the linear scale with a log scale in their Figure 4. In addition to making the presented research trustworthy, the reproducible research paradigm definitely makes the reviewer’s job much more fun!
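For readers unfamiliar with R vignettes, the idea is that narrative text and executable code live in the same file, so every figure is regenerated from a code chunk each time the document is rebuilt. A minimal illustrative sketch of such a chunk (the data file and variable names here are hypothetical, not taken from the study):

```r
## Inside a vignette, a chunk like this sits between paragraphs of
## explanatory text and runs when the document is rebuilt.
## Hypothetical data file -- the real study ships its own data and code.
rates <- read.csv("retinal_rates.csv")

## log = "x" draws the x-axis on a log rather than a linear scale:
## the kind of one-line change described above for Figure 4.
plot(rates$spacing, rates$rate, log = "x",
     xlab = "Electrode spacing (um)", ylab = "Firing rate (Hz)")
```

Because the figure is generated from code rather than pasted in as an image, any reader can make such a change and rebuild the document to see its effect.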
The study authors went to great efforts to make the data and code available, and the methods transparent. Do you think this is worth doing and what can we do to encourage other authors to follow suit?
CP: Yes, I think it is worth it! I’m sure that by making data (and code) public, researchers will gain more citations as well as attract more collaborations. But clearly the funding agencies have a big role to play in moving scientists from the present attitude (or culture), in which they consider ‘their’ data and code private property (even when the work has been entirely funded by public money), towards one in which they give access to both data and code by default. To share data, infrastructures (like the CARMEN virtual laboratory or G-Node) have to be created and maintained, and the scientists working on their development must get credit for that. This study would probably not have been possible without the CARMEN virtual laboratory.
TW: Sites that provide data hosting, like CARMEN, CRCNS.org, or G-Node, play an important role in enabling neurophysiologists to share their data, be it between colleagues or publicly, thus raising awareness for the benefits of data sharing. In some cases datasets have been made public when funding was provided to annotate and document the data. Here the incentive of a publication certainly played a role, which highlights the relevance of journals like GigaScience that enable data publications. Journals that offer data publications enable scientists to immediately gain benefits from sharing their data with the community.
Conventional journals can also help raise awareness of this possibility by explicitly encouraging practices that enhance openness and reproducibility. Whether it is necessary to go so far as to enforce this practice is a decision that a journal should consider carefully. We currently see a growing interest and willingness among neuroscientists to make their data available, so we can expect this to become common practice with time anyway.
Study author Stephen Eglen remarked: “We hope the data [in our study] will serve both as an example for developing future standards, as well as being used to address new scientific questions.” See what else Eglen has to say about this study and reproducibility in neuroscience in this GigaBlog Q&A.
More about the reviewer(s)
Thomas Wachtler is a group leader at the Ludwig-Maximilians-Universität München, Germany, and Director of the German National Neuroinformatics Node (G-Node) of the International Neuroinformatics Coordinating Facility (INCF). The Wachtler lab focuses on neuroinformatics developments for electrophysiology and on multidisciplinary research to study the neural principles of processing and coding in the visual system.
Christophe Pouzat is a CNRS (Centre national de la recherche scientifique) researcher at the Paris Descartes University, France. He obtained his PhD at the Max Planck Institute for Biophysical Chemistry under the supervision of Alain Marty, after which he joined the laboratory of Gilles Laurent at the California Institute of Technology, USA. Pouzat’s research interests centre on experimental neurophysiology, with a particular focus on data analysis in calcium imaging and spike sorting, as well as spike train analysis.