This article is part of the supplement: Data publishing framework for primary biodiversity data
Towards mainstreaming of biodiversity data publishing: recommendations of the GBIF Data Publishing Framework Task Group
1 1968½ South Shenandoah Street, Los Angeles, California 90034-1208, USA
2 Aundh, Pune 411007, India
3 Zoology Microbiology Research Group, Zoology Department, Natural History Museum, Cromwell Road, London SW7 5BD, UK
4 Royal School of Library and Information Science, Birketinget 6, Copenhagen, DK 2300, Denmark
5 Oslo University College, Pb 4 St Olavs Plass, 0130 Oslo, Norway
6 Plazi, Zinggst. 16, 3600 Bern, Switzerland and American Museum of Natural History, Central Park West at 79th Street, New York NY 10024, USA
7 Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences and Pensoft Publishers, 13a Geomilev Street, 1111 Sofia, Bulgaria
8 BioMed Central Ltd, Floor 6, 236 Gray's Inn Road, London WC1X 8HB, UK
9 Global Biodiversity Information Facility Secretariat, Universitetsparken 15, DK 2100, Copenhagen, Denmark
BMC Bioinformatics 2011, 12(Suppl 15):S1 doi:10.1186/1471-2105-12-S15-S1
Published: 15 December 2011
Data are the evidentiary basis for scientific hypotheses, analyses and publication, for policy formation and for decision-making. They are essential to the evaluation and testing of results by peer scientists, both present and future. There is broad consensus in the scientific and conservation communities that data should be freely and openly available in a sustained, persistent and secure way, and standards for 'free' and 'open' access to data have accordingly become well developed in recent years. Yet effective access to data remains highly problematic.
Specifically with respect to scientific publishing, the ability to critically evaluate a published scientific hypothesis or report is contingent on the examination, analysis and evaluation - and, where feasible, the re-generation - of the data on which its conclusions are based. It is no coincidence that in the recent 'climategate' controversies, the quality and integrity of data and their analytical treatment were central to the debate. There is recent evidence that even when scientific data are requested for evaluation, they may not be made available. The history of dissemination of scientific results has been marked by paradigm shifts driven by the emergence of new technologies. In recent decades, the advance of computer-based technology linked to global communications networks has created the potential for broader and more consistent dissemination of scientific information and data. Yet, in this digital era, scientists, conservationists, organizations and institutions have often been slow to make data available. Community studies suggest that the withholding of data can be attributed to a lack of awareness, a lack of technical capacity, concerns rooted in perceived personal or organizational self-interest, or the lack of adequate mechanisms for attribution.
There is a clear need to institutionalize a 'data publishing framework' that can address sociocultural, technical-infrastructural, policy, political and legal constraints, as well as issues of sustainability and financial support. To address these aspects of a data publishing framework - a systematic, standard approach to the formal definition and public disclosure of data - in the context of biodiversity data, the Global Biodiversity Information Facility (GBIF), the single inter-governmental body most clearly mandated to undertake such an effort, convened a Data Publishing Framework Task Group. We conceive of this framework as an environment conducive to ensuring free and open access to the world's biodiversity data. Here, we present the recommendations of that Task Group.