From deadly E. coli to endangered polar bears: GigaScience provides first citable data
06 Jul 2011
BioMed Central and BGI launch a new integrated database and journal, to meet the needs of a new generation of biological and biomedical research as it enters the era of “big-data.”
GigaScience, an innovative new journal and integrated database to be launched by BioMed Central in November 2011, has released its first datasets to be given a Digital Object Identifier (DOI). This provides a long-needed way to properly recognize the data producers who have provided an untold number of essential resources to the entire research community. It not only promotes very rapid data release, but also provides easy access, reuse, tracking, and most importantly permanence for such datasets. The journal is being launched by a collaboration between BGI, the world’s largest genomics institute, and open access publisher BioMed Central, a leader in scientific data sharing and open data.
The datasets, created by BGI and its collaborators in Germany and in the Genome10K project, include the sequence and assembly data from the recent deadly outbreak strain E. coli O104, and from 7 large vertebrates: the giant panda, which is in great danger of extinction; the Chinese rhesus and crab-eating cynomolgus macaques, which are commonly used biomedical animal models; the polar bear and the emperor and Adélie penguins, which live in extremely hostile environments; and the domestic pigeon, which has unusually accurate navigation abilities. The datasets have been assigned Digital Object Identifiers (DOIs) so that other scientists can cite them in the same manner as scientific papers.
GigaScience has been working with the DataCite organization and the British Library to enable datasets to be given DOIs. A primary goal for creating dataset DOIs is to promote extremely rapid data release and dissemination. In keeping with this, the majority of the above datasets are available prior to the publication of their associated scientific journal articles. Because article publication is currently the only effective means for data producers to obtain credit for their work, data availability is normally delayed extensively by the long writing, reviewing, and editing processes that publication requires. This can seriously impede the speed at which scientific discoveries are made.
The importance of free and faster methods of data release was made particularly clear in the recent deadly E. coli outbreak in Europe. To aid the fight against the outbreak, BGI and its collaborators at the University Medical Centre Hamburg-Eppendorf rapidly sequenced the pathogen’s genome and immediately released the data. The release was publicized through Twitter, and the scientific community began to “crowdsource” analysis of the data, with little to no delay between data production and release. The data were also released without restrictions on their use, under a public domain license. This, in conjunction with having a DOI, marks the first time a genome has been released in this way. Additionally, the growing team of researchers using the pool of data and adding invaluable information to it has now confirmed that it will also release its work in the same manner.
Speaking of the announcement, Iain Hrynaszkiewicz, Journal Publisher with BioMed Central, said: "All the articles published by GigaScience will be freely available for readers and all the data open and reproducible for researchers. An online journal and integrated database enhances the functionality and reliability of the scientific literature and demonstrates real leadership in openness in science. BioMed Central believes open access should include, wherever possible, data as well as papers, and we're delighted to be working with BGI to make data a truly first-class citizen in publishing."
Head of Public Relations, BioMed Central
Tel: +44 (0) 20 3192 2216
Mob: +44 (0) 7825 257 423
Notes to Editors
1. The BGI portal page and the GitHub repository provide the latest open research from the E. coli community.
2. Newly released animal genomes are available from the Genome10K website.
3. GigaScience will publish articles relating to biological and biomedical “big-data” studies, and will provide a forum for dealing with the difficulties of handling large-scale data from all areas of the life sciences. The journal will have a completely novel publication format, one that integrates manuscript publication with complete data hosting. To encourage transparent reporting of scientific research and to enable future access and analyses, manuscript submission to GigaScience requires that all supporting data and source code be made available in the GigaScience database and other relevant repositories. Drawing upon, and integrating into, BGI’s extensive data-hosting and cloud-computing infrastructure, GigaScience will provide users with access to associated online tools and workflows, maximizing the potential utility and re-use of data.
4. BGI (formerly known as Beijing Genomics Institute) was founded in 1999 and has since become the largest genomics organization in the world. With a focus on research and applications in the healthcare, agriculture, conservation and bio-energy fields, BGI has a proven track record of innovative, high-profile research, which has generated over 178 publications in top-tier journals such as Nature and Science. BGI’s distinguished achievements have made a great contribution to the development of genomics in both China and the world. Our goal is to make leading-edge genomics highly accessible to the global research community by leveraging the industry’s best technology, economies of scale and expert bioinformatics resources. BGI and its affiliates, BGI Americas and BGI Europe, have established partnerships and collaborations with leading academic and government research institutions, as well as global biotechnology and pharmaceutical companies. At BGI, we have built the infrastructure and scientific expertise to enable our customers and collaborators to move quickly from samples to discovery.