Large and linked in scientific publishing: the launch of ‘big data’ journal GigaScience
12 Jul 2012
BGI, the world’s largest genomics institute, and BioMed Central, a leader in scientific data sharing, aim to revolutionize science publishing with the launch of GigaScience, a new open access, open data journal with a scope that embraces all life science research that generates ‘big data’.
This launch is a major first step towards the open access publication of complete, reproducible accounts of all parts of data-intensive scientific research projects. Together GigaScience and its integrated database GigaDB provide scientific analyses, full dataset hosting, and access to the software tools used to conduct these analyses, along with publication of more traditional scientific articles describing the studies.
Having all of these together finally allows readers not only to glean the scientific conclusions in the papers, but also to test them directly using the underlying data and analysis tools. In this way, GigaScience helps to overcome the growing problem of the lack of reproducibility of research. GigaScience publications also include Digital Object Identifiers (DOIs) for all datasets in the journal database, GigaDB. This helps make datasets more permanent, as well as fully trackable, discoverable, linkable, and citable, which traditionally has been possible only for journal articles. Citation enables scientists who generate these enormous datasets and share them with the community to gain more appropriate credit for their contributions to research.
Laurie Goodman, Editor-in-Chief, says, “The full use of large-scale data has sadly lagged far behind our ability to produce it. The leaders of BGI realized they had the ability, given their vast computational resources, to create an innovative new journal format — one where enormous datasets could be fully hosted and directly linked to their original scientific studies. By including analysis tools in a data platform, as well as the planned addition of cloud technology later this year, GigaScience can serve as a means to put such data into the hands of researchers who do not have the vast computational resources required for optimal data use. This is in keeping with the goals of our co-publisher BioMed Central, which makes them the perfect partner in achieving this endeavour.”
Exemplifying GigaScience and GigaDB’s innovative approach to publishing, the launch edition includes a research article from Stephan Beck’s group at University College London, UK. This article focuses on ways to conduct whole-genome analyses of DNA methylation, an important mechanism that regulates gene expression. The article is accompanied by all of the supporting data and software tools needed to recreate the experiments, a total of 84 GB, freely available for download and reuse from GigaDB. Using BGI’s data storage capacity, GigaScience is able to host these and other files, which are far larger than any other journal is able to publish. GigaDB furthermore supports open data by waiving all copyright in published datasets through its use of the Creative Commons CC0 public domain dedication. This allows anyone to access and reuse published data without restrictions.
This is part of a forward-thinking, technology-driven approach to science publishing. As Publisher Iain Hrynaszkiewicz says, “We traditionally have only had access to limited amounts of scientific knowledge – usually articles in journals summarizing experiments – which means we do not reap the full benefits of research. Through GigaScience’s open access, open data journal and database we are entering a new era of publishing, where large amounts of scientific data are as accessible, citable and interconnected as the literature which they support. BioMed Central are delighted to be leading this revolution in open data and science communication with GigaScience and BGI, which we hope will ultimately help make scientific research faster and more reliable.”
In addition to this innovative, big-data-driven publication format, the journal provides reviews and commentaries that address the many hurdles that must still be overcome to improve big-data handling.
BioMed Central and the GigaScience editors will be marking the journal’s launch at the ISMB conference 15-17 July 2012 (booth number 36).
Public Relations Manager, BioMed Central
Tel: +44 (0) 20 3192 2433
Mob: +44 (0) 7825 257423
Notes to Editors
1. BioMed Central is an STM (Science, Technology and Medicine) publisher which has pioneered the open access publishing model. All peer-reviewed research articles published by BioMed Central are made immediately and freely accessible online, and are licensed to allow redistribution and reuse. BioMed Central is part of Springer Science+Business Media, a leading global publisher in the STM sector.
2. BGI (formerly known as Beijing Genomics Institute) was founded in 1999 and has since become the largest genomic organization in the world. With a focus on research and applications in the healthcare, agriculture, conservation, and bio-energy fields, BGI has a proven track record of innovative, high-profile research, which has generated over 178 publications in top-tier journals such as Nature and Science. BGI’s distinguished achievements have made a great contribution to the development of genomics in both China and the world. Its goal is to make leading-edge genomics highly accessible to the global research community by integrating industry’s best technology, economies of scale, and expert bioinformatics resources. BGI and its affiliates, BGI Americas and BGI Europe, have established partnerships and collaborations with leading academic and government research institutions, as well as global biotechnology and pharmaceutical companies.
3. GigaScience is co-published by BGI, the world’s largest genomics institute, and BioMed Central, the world’s largest open-access publisher. The journal covers research that uses or produces ‘big data’ from the full spectrum of the life sciences. It also serves as a forum for discussing the difficulties of, and unique needs for, handling large-scale data from all areas of the life sciences. The journal has a completely novel publication format, one that integrates manuscript publication with complete data hosting and the incorporation of analysis tools. To encourage transparent reporting of scientific research and to enable future access and analyses, GigaScience requires, as a condition of manuscript submission, that all supporting data and source code be made available in the GigaScience database, GigaDB, as well as in publicly available repositories. GigaScience will provide users access to associated online tools and workflows, and will be integrating cloud resources into the database later this year, maximizing the potential utility and reuse of data. (Follow us on Twitter @GigaScience, and keep up to date on our blogs.)