Freedom of Information Conference 2000
Dr Pieter Bolman Academic Press
The effects of open access on commercial publishers
What I'm going to talk about is the commercial publishers' point of view. However,
a lot of my talk is my personal opinion and I will not necessarily be
speaking on behalf of my organisation. I've divided my talk into two parts:
the first is what users want and what they will get, and the second is
whether information wants to be free.
What do users want and what will they get?
Every research worker should have access to a refereed scientific journal
and other literature. There is a lot of emphasis on journals right now
and I want to de-emphasise that slightly by saying that all the facilities
of the internet should be made available to researchers with maximum efficiency
and effectiveness.
So what is scientific literature?
Scientific literature is the primary information, the raw research results
that are usually published in journals. There is secondary information
that's actually the metadata, the way to get to the primary information.
And then there's what I call tertiary information. I know there are different
definitions for that but basically what I mean is the synthesis of primary
information on various levels - reviews, major reference works, encyclopaedias,
textbooks, and databases. These are mostly published in books but we do
have review journals as well.
So what is a primary journal?
It is a collection of research articles in a discipline with a certain
quality stamp. There are about 10,000 STM journals, fully indexed in abstracting
and indexing services. About 1 million articles and 25 million references
are added on an annual basis. One stands on the shoulders of giants, as
Newton once said, and as far as I know this is pretty much unique to science.
Dr Varmus mentioned briefly that there are two basic ways of gathering information.
There is browsing, which is following your nose, and you do that typically
with tables of contents of journals or you, as I call it, go diagonally
through articles. You have a look at the references to see whether you
are quoted, that kind of thing! And then, of course, you have the search:
you know what you're looking for and you do that either directly or
via an automated profile.
In my own experience as a researcher I spent more time browsing than searching.
Essentially what you did was go to the library, browse through journals,
and look at equations and figures, and you did that on a regular basis.
Only when you had to write something did you search. So browsing is a
very important aspect.
Anyway, we're now on the web, and how do we browse? We feel it should be possible; a
click should do the trick. But that requires something. It requires links,
e-links, between tertiary and primary literature, both backwards and forwards.
The links back from primary to secondary and primary to tertiary are actually
quite problematic. I think I used to call that link editing and it requires
added value of an intellectual kind. To achieve proper linkages you need
several things: a complete database of primary sources (the journals);
a complete distributed database of tertiary sources (journals, books,
databases, etc.); a database of metadata; direct links between all three;
and some search engines. Now, if we say that this is an ideal of what
we're looking for, it is actually still an e-analogue of what we have
in print but it certainly is a more efficient analogue. So let's see where
we are right now.
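As a rough illustration of the two-way linking described above, here is a minimal sketch in Python. The `LinkIndex` class, its method names, and the item identifiers are all hypothetical and are not taken from any real linking service. The sketch only shows why each link has to be recorded in both directions if a reader is to click forwards, from a review to the articles it cites, and backwards, from an article to everything that points to it.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical two-way index of links between literature items."""

    def __init__(self):
        self._forward = defaultdict(set)   # item -> items it points to
        self._backward = defaultdict(set)  # item -> items that point to it

    def add_link(self, source, target):
        # Record the link in both directions so it can be followed either way.
        self._forward[source].add(target)
        self._backward[target].add(source)

    def links_from(self, item):
        """Items that `item` points to (forward links)."""
        return sorted(self._forward.get(item, set()))

    def links_to(self, item):
        """Items that point to `item` (backward links)."""
        return sorted(self._backward.get(item, set()))


# Illustrative identifiers only: a review (tertiary), an abstracting
# record (secondary), and two research articles (primary).
index = LinkIndex()
index.add_link("tertiary:review-A", "primary:article-1")
index.add_link("tertiary:review-A", "primary:article-2")
index.add_link("secondary:record-1", "primary:article-1")
index.add_link("primary:article-2", "primary:article-1")

print(index.links_from("tertiary:review-A"))
print(index.links_to("primary:article-1"))
```

The forward direction comes for free from the citing document's reference list. The backward direction only exists if someone maintains the index as new material is added, which is the ongoing effort the talk refers to.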
Well, the core journals are on the web. Practically all major secondary services
and some tertiary sources are on the web and the number is growing. By
that I mean there are now book publishers who put their books on the web,
who put their major reference works on the web, who structure them and
have them interlinked with each other and also have them linked to primary
literature.
PubMed Central versus CrossRef
We have two major initiatives going at the moment: PubMed Central and another initiative
called CrossRef. PubMed Central has a centralised approach and is funded
by the US government. CrossRef is a global, not US-based, initiative funded
by the private sector.
Well, let's just talk about PubMed Central. The way I see it, and I'm sure my opinions
can be changed, it's only biomedical in scope, which I don't think anybody
will disagree with. It wants to make the information available for free
and that means two variables are being changed at the same time - not
only do you revolutionise the access model, you also revolutionise the
business model. That's asking for trouble. It relies on the existing biomedical
journal publishers to give up their files.
It's not surprising, and, in fact, we have noticed it today, that there is some
considerable resistance from both the taxed and tax-exempt publishers.
The question really is, will journals commit economic suicide? And that's
the most existential question one can ask. It's something that's held
a lot of people back. Another problem with it is that it's a US Government
intervention in the private sector and you can ask yourself whether that's
desirable.
Well, if we are going to assess PubMed Central versus CrossRef then we'll see that
CrossRef has 700,000 articles annually and has 3 million articles interlinked
and available on various distributor databases. These are back articles
so to speak. I don't know how many PubMed Central has at the moment but
I've estimated 700 - because it looks so nice versus the 700,000 - and
perhaps 1,000 back articles. Now, if each publisher within CrossRef decides
on the terms of access and all articles remain interlinked, I would say
that because it's a distributed approach it's much easier to implement
and it's generically more flexible and also more scaleable.
My conclusion is that CrossRef has a better chance of succeeding. If you want to give free
access it can be realised within CrossRef for those who wish to participate.
And, although it was planned before PubMed Central, the swift success
of CrossRef has certainly been catalysed by it. From this point of view
I think PubMed Central has served its purpose and I would invite them
to join us and just basically get on with it. We are talking about the
users who have to be satisfied and it's not something that should be dependent
on egos. So, there you are, this is an open invitation for PubMed Central
to join CrossRef and get it all over with and make sure things are happening
swiftly.
Does information want to be free?
I've come to the second part of my talk and that refers to what David Lipman said
about information wanting to be free. I must admit I found this very difficult
and I just scratched my head. It's a very evocative thing to say and I wrote
down a few thoughts. You have to take them in the light-hearted way in
which they were intended but my toes really started to curl. Is this a
starry-eyed egalitarian spoof, a communist plot perhaps, or is it ivory
tower academics on the loose? We have to realise that those who have proposed
this freedom are from very academic backgrounds.
We have to realise that a lot of research funding in the biomedical area is from
industry and, of course, we don't always like it. When I started working
for North Holland I didn't particularly like it either. In fact my interviewer
asked me, "Don't you feel it's morally wrong and reprehensible?" Well,
I took the job so you can imagine what my answer was but it's something
that I think is a real dilemma and we all need to realise that.
What is the role of the publisher?
Now, what is the role of the publisher? When you are in a commercial company and
run lots of journals you have to make sure that they remain on track.
As you know, editors usually have a scientific background but
they're certainly not specialists. They have a hard time keeping journals
focused. It is labour-intensive and results are sometimes elusive.
You have to make sure maximal access is given to scientists wherever they are,
and that whatever bells and whistles they require are going to be added.
I think it's very important that the archive is being created and maintained
in a linkable form and the only institution that can do this - given that
new information is being added all the time - is the publisher. It's a
new role for publishers and it has been taken on. Harold Varmus said this
might be something that needs to happen. It's already happening on a very
large scale.
Then of course you have to develop and publish all kinds of what I call the sources of
consolidation and synthesis, reviews and all these kinds of things, the
tertiary information that I was talking about earlier.
The system will favour those who pay
Basically we are talking about an economic system and it tends to favour those who
pay. Therefore if the author is going to pay the system will lean towards
the author's goal and that is to get published; and if the readers pay,
the system will lean towards the readers' goal and that is effective filtering.
I think the latter is important because we're talking about the building
and maintenance of the scientific edifice and global accessibility. Now,
one can argue about a lot of things but I don't think you can dispute that
the private sector has done a very good job; certainly the society and
commercial publishers have. Maybe nowadays some things are going awry
and that is certainly debatable, but if the commercial publishers had
not been there in the fifties and sixties, science would not have expanded
as quickly, because societies were by no means ready, mentally
or otherwise, to take care of the enormous expansion in the literature,
especially in those days. Unfortunately, government intervention is by
definition local, not global, and subject to politics, etc.
Some truths and some truisms
There's no free lunch; somebody has to pay. Some academics have a deplorable disdain
for cost accounting, including me. I really had to be pulled in by the
hair to all these things when I first started working for a company. But
if it's free, if there's no accountability for costs, then the cause and
effects blur, and you can see that in government, you can see that in
companies and eventually someone will come in and say, "I just don't see
it anymore. Why the hell do we do this? Everyone has forgotten." And then
the knife is going to be put into it.
Free access. A few things. Research
workers have usually never paid for access, let's face it. Occasionally
they get their own society journal, but the library pays, not the research
worker, and since the arrival of the web there is more access than ever before. The only
thing I can quote is from my own company, Academic Press, now with nine
million authorised users. We are in the process of giving access to Africa,
Eastern Europe, and Latin America. That is what technology can do and
that is what we're doing and I know that others are doing the same. So
the disenfranchised can be taken care of also in a different model, and
they are being taken care of.
Conclusions
So, in conclusion, I would say that the journal will continue to exist because it's a distinct
collection of certified articles. I haven't gone into that very deeply
but I would certainly not mind having a debate on that sometime, because
I think it is very important. The technology will enhance diversification
but whatever we are going to do we need to make sure that the stable forms
of the past, present, and future are compatible with each other, because
it's one continuum that we're talking about. We can't all of a sudden
start all over again and forget about the past.
Two quotes. A Mr Conway, I've forgotten who he was but I liked it so I'll just show
it to you. "The sweeping revolutionary change is more likely to result
in chaos than in success", which probably some people want but I think
most people don't. And "any plan that threatens the fundamental values
of scientific publishing should be avoided."
I want to leave you with two remarks. One is "Free science versus free medicine:
is it better to have free access to the cause than to the cure?" I think
that is an important question that we have to ask ourselves; we can't
just run away from it simply because we happen to see a particular idea
and it attracts us. And secondly, I want to recommend a book to you, which
I don't normally do. It's not published by Academic Press. It's called
"If you are an egalitarian, how come you are so rich?" It's been recently
published by a professor at the University of Oxford. I haven't read it;
I have just read a very extensive review of it, but it's something that
I would recommend you buy.
Questions from the floor
Is comparing CrossRef and PubMed disingenuous?
Questioner 1: I was a little surprised by what I would consider the false juxtaposition
of PubMed Central and CrossRef. We're a member of CrossRef and it has
a very different goal to PubMed. There's a lot more to scientific publishing
than just linking and CrossRef is wholly centred on interlinking. So I
was curious as to why you think they are equivalent. And the other
thing is that the CrossRef system, the current economic model that is
set up, while it's not for profit, does favour the publishers and not
the research community. There's a definite friction that's added to the
system by the costs for mapping from citation metadata, in the case of
journal articles, to otherwise undiscoverable digital object identifiers.
So I don't see how a proprietary system that's run by the publishers,
which also has a central component - which is the central metadata database
- is any different than PubMed Central or PubMed. But in fact the economic
model is completely different and there are barriers and real friction
introduced in CrossRef and I'm not sure why you see it as a cure-all or
something that's even remotely close to being equivalent to PubMed Central.
Pieter Bolman: Okay, well, one of the things I could not show, because of the
time, was the later phases, phase two and phase three, of where CrossRef
is going. What we want to do is create a distributed database of
full-text articles and interlink them, and with search engines
make sure that you have cross-file searching. It's something that's also
been shown by the Open Archives Initiative, and that is the next phase.
Of course it is only for those that want to participate because everything
is voluntary here, but I think the basis is essentially the same. You
want to make sure that eventually the end user sees a complete virtual
file of everything that is available and has access to it only one or
two clicks away.
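To make the idea of cross-file searching over a distributed database concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `PublisherStore` class, the `cross_file_search` function, the publisher names and the simple title match are hypothetical and do not describe how CrossRef or any publisher actually works. Each publisher keeps its own file and its own access terms, and a single query is run across all of them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Article:
    identifier: str
    title: str


@dataclass
class PublisherStore:
    """Hypothetical per-publisher file with its own access terms."""
    name: str
    free_access: bool
    articles: list = field(default_factory=list)

    def search(self, term):
        # Naive case-insensitive match on the title only.
        return [a for a in self.articles if term.lower() in a.title.lower()]

    def can_read(self, subscriptions):
        # Readable if the publisher gives free access or the reader subscribes.
        return self.free_access or self.name in subscriptions


def cross_file_search(stores, term, subscriptions):
    """Run one query across every store and report whether each hit is readable."""
    hits = []
    for store in stores:
        readable = store.can_read(subscriptions)
        for article in store.search(term):
            hits.append((store.name, article.identifier, readable))
    return hits


stores = [
    PublisherStore("publisher-a", True, [Article("a1", "Protein folding")]),
    PublisherStore("publisher-b", False, [Article("b1", "Folding kinetics"),
                                          Article("b2", "Galaxy formation")]),
]

print(cross_file_search(stores, "folding", subscriptions=set()))
print(cross_file_search(stores, "folding", subscriptions={"publisher-b"}))
```

In this sketch every reader sees the same list of hits, but the third field of each hit depends on that reader's subscriptions.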
Questioner 1: Right. You can only call it an integrated database if access rights
are uniform across it. I think it's disingenuous to say that you're creating
a database if you are only linking across things with different access
rights, and I disagree strongly. I think Harold Varmus did very well this
morning demonstrating that there are real access problems. If you are
talking from an Ivy League institution it is an integrated database but
for many others it is not.
Pieter Bolman: Obviously it is just a matter of where we are starting. We
have nothing on PubMed Central either. The point is how you start and
how you get where you want to be. I think the objective is exactly
the same, so the question is how you are going to achieve it, and this
is what those of us who started CrossRef feel we want to do. Now, not
everybody who is a member of CrossRef
may agree; that's fine. It's a totally voluntary system. But it is likely
to be successful much more quickly than PubMed Central and that's all
I'm trying to say.