Carole Goble on the importance of truly open data and the CC0 waiver

Posted by Biome on 4th September 2013

“If I have seen further, it is by standing on the shoulders of giants.”

Isaac Newton

The UK’s Royal Society is arguably the oldest learned society still in existence, and one of which Newton was a Fellow. In June 2012 it produced an influential report on Science as an Open Enterprise. The first of its ten recommendations states that “Scientists should communicate the data they collect and the models they create, to allow free and open access, and in ways that are intelligible, assessable and usable for other specialists.” There are numerous reasons why this is a good thing. One is that tenet of the scientific method – reproducibility or replication – backing up claims, auditing methods and validating results. Reproducibility is perhaps more often preached than practised, and we have recently seen much hand-wringing on the topic of open scrutiny. See Nature’s ‘cut out and keep’ collection of articles on the topic and a recent keynote of mine.

But where defence is a kind of backward-looking perspective, contribution is a forward-looking one. Scientific insight is gained by pooling data so that others can (re)examine, combine and process it. The use of data is one of the points of its production. Scientific knowledge and understanding is a cumulative affair. The biology community can be rightly proud of its track record in providing, curating and contributing to core public data facilities, and of its data campaigns and data deposition declarations.

Josh Sommer, founder of the Chordoma Foundation, refers to ‘knowledge turning’ (the turns of a process to derive more good) between laboratories, by which he means the ability of information to smoothly move between researchers and feed into the knowledge-making process that is scientific research. Cameron Neylon, co-author of the Panton Principles for open data in science, argues that the success of this process depends on lowering frictions. One friction is licensing. Open Licensing helps oil the wheels of knowledge turning.

BioMed Central’s adoption of the Creative Commons CC0 waiver opens up the way that data published in their journals can be used, so that it can be freely mined, analysed, and reused. This step was a result of an open consultation where, encouragingly, respondents were in favour by six to one. The consultation report addressed many issues, and one in particular is the potential for credit to create friction. Credit is important. Newton’s “giants” quote is actually part of a discussion with Hooke about scientific credit.

Applying the CC0 waiver to published data means that legally there is no requirement for author attribution. But a norm of science is reciprocity. It’s credit where credit is due. It’s being a good citizen. Citation is not like money, where we have a limited pile of credits that we might run out of. Citation is, instead, like love.

The quid pro quo tenet of science means that if data producers have made their data openly licensed then its consumers should credit them wherever possible. Please do. For if we are to see further, it is by standing shoulder to shoulder with each other.


More about the author(s)

Carole Goble, professor of computer science, University of Manchester.

Carole Goble is a professor of computer science at the University of Manchester, UK, where she has co-led the Information Management Group since 1997, carrying out research into the design, development and use of data and knowledge management systems. Goble also leads the myGrid project, the UK’s largest e-Science pilot project, which is now part of the Open Middleware Infrastructure Institute, and is co-director of the e-Science North West regional centre. Her research interests include the Semantic Web, medical informatics, e-Science, Grid computing, and bioinformatics. Goble produced the first reference architecture for the Semantic Grid (S-OGSA) through the Ontogrid project and co-chairs the Open Grid Forum Semantic Grid Group.