Freedom of Information Conference 2000
Harold Varmus, Memorial Sloan-Kettering Cancer Center
What will the new environment look like?
Most of us
are here today because we recognise that science and the publishing of
science in particular is undergoing a fundamental but also difficult transition.
A transition from a mode of publication that arguably began 500 years
ago with the publication of the first scientific journals and a transition
to a new mode that's made possible by computer science and the internet.
We all know and believe this transition is going to occur and I suspect
we're all in this room today because we believe it's going to occur in
the next decade or two. But all scientists and publishers of science are
going to have a strong influence over the pace and form this transition
takes, and I think that's the charge for today and for tomorrow; trying
to figure out how this transition occurs and how quickly it occurs. So,
in addition I think today is a time to consider where we are in the transition,
how the early steps have occurred and to think of corrective actions - just
as if we were sailing toward a distant port and had to tack into the wind.
Five goals for electronic publishing
The first
goal seems to be to broaden and simplify access to scientific data. This,
of course, includes better dissemination and more thorough presentation.
In the view of many of us, access itself should be completely free and
that, of course, is an arguable point I'm sure will be discussed later.
But in any case, from the perspective of someone like me who's a biomedical
scientist, the point of increasing access is to accelerate the use of
scientific data for the benefit of public health.
The second
goal is economy and it's the conviction of many of us that in the long
run the use of electronic publishing will reduce costs of publishing overall,
although the distribution of expenses may be argued - that is, how various
components of scientific publishing share those costs, such as authors,
readers, publishers, scientific societies, institutions that house
scientists, government agencies and libraries, and the degree to which
those components should share the costs is a debatable issue. But I think
it's fair to say that in the long run economies can be achieved.
The third
goal is quality. It seems to me that one of the possibilities electronic
publishing offers is to preserve and even improve the quality of the review
and comment about manuscripts. This is an issue that has been hotly discussed
because of the possibility of contaminating the world's literature through
random and unfiltered access to the distribution source. But it's my conviction
that we can sustain and create peer review journals and that we can find
new ways to critique and supplement papers in a way that will actually
improve quality through electronic means of distribution.
The fourth
goal no one argues with is innovation. There are ways through digital
presentation to improve the way we portray our results by making large
searchable data sets embedded in scientific manuscripts, by providing
pre-print servers, by using hypertexts, by using abundant digitalised
images and even movies, all of which can markedly enrich the reading experience.
And the final
point, of course, is retrieval. This is something we already experience
and benefit from: the idea that we can archive, search, and retrieve reports
in powerful ways, placing large amounts of the scientific literature,
including past literature and, hopefully, present and future literature,
into databases at multiple sites with innovative search engines that
allow scientists to carry out that second reading function - the first
being browsing, the second being searching. The second is probably, for the
most part, the more important.
The issues we overlook
There has
been resistance towards proposals for how we should sail towards these
goals and, as a highly visible author of manifestos for systems such as
E-BioMed, E-BioScience, and now PubMed Central, I've been one of those
people who's felt the resistance strongly. Now, before proposing how the
ship should sail in the short run I'd like to talk about three fundamental
issues that are often overlooked in the kinds of discussions we're having
today.
The first
issue, for brevity, I refer to as "writers versus authors."
The point here is to emphasise that a person describing scientific results
is motivated in a fundamentally different way from people writing other
kinds of prose. I think about this by thinking about my personal computer.
My wife and I spend a lot of time composing sentences on that computer
and thus we are both, in some sense, writers. But that's really where
the similarity ends. When she writes about philanthropy or about gardening
she hopes to sell her material. The text, with its ideas, its news, its
sentences, is sold to a magazine, newspaper or book publisher who will
then sell them to interested readers. And she wants lots of readers because
every time a new reader is added she gets a larger payment. She has very
little interest in distributing her work for free because writing is what
she gets paid to do. She is what we call a "writer."
When I write,
especially when I write a scientific report, I feel more like someone
trying to disseminate results generated in the laboratory rather than
a real writer. I'm hoping my report will be favourably viewed by an editorial
board of a well-read and respected journal not because I want to collect
a fee but because that's how my work can best have an impact on other
scientists and best have an impact on the development of the field. After
all that's why the NIH and the American Cancer Society have paid for the
salaries, equipment, and supplies that allow me to do the research. So,
as an author of scientific reports, what I'm interested in is having my
work viewed by the greatest number of people who are likely to have some
interest in it. This is more likely to happen if it appears in a journal
that many people read because they respect the journal but they're also
especially likely to read it if they have access to it. None of us can
directly control the reputations of journals but we can improve access
to those journals. Expanded access then has a tremendous advantage to
authors and readers and the public that benefits from science.
The second
issue concerns the disenfranchised, a topic we're all aware of but one not
spoken about enough. The effect of electronic distribution of scientific
information on the disenfranchised can be even more profound than some
of us think. In technologically advanced and wealthy countries such as
ours, scientific reports are not hard to get hold of. We have good libraries,
we can afford to subscribe, we use PubMed regularly, and often our institutions
subscribe to technically excellent vendors like HighWire that make
an important selection of journals accessible to our institutional computers.
While these mechanisms may be imperfect they provide a good service and
frankly more reading material than most of us have time to deal with.
Outside of this narrow circle of affluence, however, access drops precipitously.
Even in the US and Europe many research institutions don't have subscriptions
to electronic texts or the full panoply of journals and most investigators
can't afford more than a few journals of their own. In the developing
countries the situation is dramatically worse and it seems especially
unfortunate when we recognise that putting scientific reports on-line
for the rich countries would tremendously benefit those in poor countries
that really have acute problems of access.
I recognise
this is not simply achieved by putting journals on-line for free. There
is a need to establish strong internet connections and to put technical
service on the ground. My own experience in overseeing research on malaria
in Africa has taught me this lesson. We had to send people from the National
Library of Medicine out to a variety of countries to try to establish
even a single site of internet connectivity. In fact, the importance
of electronic publishing to the disenfranchised has not been overlooked
by Kofi Annan, the Secretary-General of the UN. In his millennial statement
issued a few months ago he talks about digital bridges and, in particular,
about the impact of electronic distribution of information in the health
and science arena. He says, for example, that new technology offers an
unprecedented chance for developing countries to leapfrog earlier steps
of development and everything must be done to maximise their people's
access. He points out that information has unique attributes that have
tremendous potential benefit to some of these countries - steel, boots,
and other things are consumed, information is different, it can be made
available for multiple use and is not consumed. In fact, it's more valuable
the more it's used. He urges the policy-making world to understand how
the economy of information differs from the economy of scarce physical
goods and to use it to advance policy goals such as a new health inter-network
for developing countries to establish 10,000 on-line sites and to transmit
health and medical information tailored for specific countries. He also
announces a second digital bridges initiative called the United Nations
Information Technology Service that will be used to train groups in developing
countries to be the technical forces on the ground, to ensure that the
installation of these new tools will bring information in a sustained
way to these developing countries.
The third
issue concerns the filtering of information on the internet and the retention
of useful hierarchies. It's my conviction that distribution of scientific
information through electronic means does not mean having unregulated
or unqualified dissemination of information or a loss of a valued hierarchy
of information structure in science. In the responses that David Lipman,
Pat Brown, and I received to some of the proposals we put together last
year, the greatest passion was invested in this notion. I understand why
it's easy to be frightened about what's happening, through the internet,
in many disciplines and could happen in science. We shouldn't be naïve
about this.
This weekend
I was reading about a different subject, politics, and I was struck by a paragraph
in an essay on Dick Morris in a book about recent political tracts. The
writer said, and I quote, "Morris argues that modern technology has
made voters better informed than ever and thus better qualified to take
a more direct role in law making. I doubt this. There is certainly more
information available to more people but the internet has also removed
the traditional filters that once screened a good deal of nonsense out
of our national debate. On the web today one can read all manner of conspiracy
theories, baseless accusations, character assassinations, and economic
quackery. Many people think no doubt, this stuff could not be said if
not true, in some place there was an authority who would keep the record
straight. There is none." Now, obviously this concern - which is
a real one - pervades much of the discussion on scientific publishing.
We all know there's a tremendous amount of junk out there so there is
no doubt there is a problem. But I believe it can be controlled if we
are careful and there are many ways to do that: by credentialing
contributors to any electronic sites that are established for scientific
publishing, labelling entries with the explicit criteria that were used
for exclusion, and retaining many of the filtering properties that currently
exist in print journals, such as editorial boards, peer review and many others.
Achieving the transition
I'd like
to talk for the last few minutes about what I referred to at the outset as
the transition, that is this potentially short, potentially prolonged
period in which we go from a world of paper-based journals to an electronic
world. One of the difficulties about making proclamations such as the ones
that my colleagues and I made last year is that one tends to get rather
specific about how the final world should look and I think it's premature
to say that. Instead I'd like to think about what we ought to achieve
in the next couple of years to bring us closer to where I think we all
want to be.
First is
testing the idea of open access, to demonstrate what it feels like to
have an electronic distribution of a few important journals. I would contend
that the activities currently in progress with PubMed Central are an example
of the way in which we can test what open access is like.
The second
objective is data collection, developing data that establishes the technical
and financial boundaries of unfettered electronic distribution. Many of
the debates I've engaged in have been characterised by a paucity of information
about what is feasible and what the real costs are. This will allow more
solid proposals for operation of large systems and I would again argue
that many of these data collection exercises could be carried out in the
context of the PubMed Central experiment.
The third
objective is the initiation of some new electronic journals specifically
targeted for distribution by a mechanism like PubMed Central. It's no
secret that BioMed Central represents a collection of such journals and
I congratulate Vitek Tracz for making this leap into the unknown.
The fourth
issue is studying the consequences of starting such journals. How are
they received? What are the impacts on library costs? What are the attitudes
toward journals published only electronically with free access
for the reader? What is the willingness of authors to publish in such
journals as opposed to paper journals of more limited circulation?
The fifth
issue is one I think is particularly important, and that is that the electronic
world has the potential for storing information published in the past
that has become more and more inaccessible with time. One of the greatest
things we can do is to build a large electronic archive of the important
literature in biomedical, chemical and allied research and that we can
do it without extraordinary expenditure - one that will definitely pay
off in the long run. But we need to begin to experiment with this, determine
how much it's going to cost and begin to line up potential financiers
to make it happen.
The sixth
issue is the creation of mirror sites. Ever since our first manifesto
we've been talking about having multiple sites and an international network
of distribution sources and archiving sites. It hasn't happened yet and
I think it's very important that our colleagues in Europe and Asia begin
to consider teaming up with PubMed Central to develop mirror sites so
we have assurance about storage and distribution and that we emphasise
the international aspect of scientific publishing.
The seventh
issue is one that is close to my heart but a controversial one, and that
is that we begin to experiment with a more porous filtering process for
scientific data that's presented through such means as PubMed Central.
We should do some experiments with screened but un-reviewed reports, perhaps
starting with things most highly acceptable for such distribution like
sequence comparisons, protein structures, big data sets, resource inventories,
and clearly indicate the material has not been through the traditional
peer review process. I firmly believe doing experiments of this kind will
present to the reading public some remarkable ways in which innovative
thinking about electronic publishing can markedly enrich the experience
of doing science in the modern age.
And of course
the final issue is a platitude but it's an important one, and that is
that we keep talking, we have meetings like this one in which people representing
the concerned constituencies keep the issues in the air by writing opinion
pieces, seeking support from groups outside the usual sets of people including
advocates for disease specific research, health workers, politicians,
and many others.
Questions from the floor
What exactly is PubMed Central?
Questioner 1: My name's Bob Simoni from Stanford University (associate editor
of the Journal of Biological Chemistry). What exactly is PubMed Central?
I don't ask the question lightly. I've followed its development quite
closely. It was going to be a pre-print server, it was to be an archive,
it was to be a journal, and watching what's transpired since it started
six months ago, it's not obvious that it is any of these. It's also not
obvious where it's headed.
Harold Varmus: First of all, I don't agree with your characterisation of
what it was intended to be before. No one ever argued this was going to
be a journal but I do agree with you that it has undergone changes, although
the basic vision has not changed.
The first
vision that was laid out was the vision of one set of blueprints but maybe
not the right set of blueprints and I think what we've progressed to is
a short term view of how a government agency can help us move toward a
world in which the long term goals can be achieved. What the short-term
view represents is a distribution site, in Pat's metaphor, a post office
for distribution, a public vehicle with government financing to help distribute
electronic information published in existing journals.
When would
they be presented to the public through these means? It could be at the
time of acceptance. It could be at the time of publication. It could be
a month, two months thereafter. Most journals that have contributed to
PubMed Central have elected to do so roughly a month after publication,
obviously designed to protect their subscriber base. One of the glorious
things about the way it's been set up is that it's intimately connected
to PubMed itself, which is a widely used search engine for titles and
sometimes for abstracts. So you go to PubMed, get your title and if that
journal happens to have provided its content to PubMed Central you have
immediate access to it. I'm not sure how many journals currently provide
material but it's something in the order of 20.
Questioner 1: So it is currently a repository?
Harold Varmus: Well, it's also a distribution mechanism.
Questioner 1: What about the other components? I mean, there was a time when
there was to be a peer review component.
Harold Varmus: There was to be. It was going to encourage the formation of
journals, a screening process that allowed people to post things but would
not be a peer review system run by the government.
Questioner 1: So it's currently a repository in the sense that it's posting and
distribution. What about the pre-print server, for example?
Harold Varmus: Those are things that could happen in the future. There is
an advisory board on which I and several people in this room serve that has
discussed the issue - whether to establish a pre-print server and take
on non-reviewed material that is screened but not reviewed. We decided
to defer that decision to a later time.
Will PubMed Central result in the end of scientific societies?
Questioner 2: The internet offers some wonderful opportunities for scientific
societies and also risks if they make a false step. For many of these
societies the publication of a journal is their primary source of income.
If these societies lose that income it will wipe out their activities.
What will happen to science if these societies are not able to exist?
Harold Varmus: Putting the question that way is very different from asking
how the societies can contribute to a beneficial development to do with
publishing. Science could not thrive without scientific societies, they're
incredibly important, they are our guilds and no one interested in scientific
publishing thinks they should disappear. However, I do think societies
should not be reacting reflexively and defensively to proposals that would
benefit them by improving the environment in which scientific publishing
is done because they think they're going to lose revenue. Instead, other
kinds of solutions should be sought and I believe there are other solutions.
For example, it seems to me that most societies can generate revenues
in other ways and to say the existence of the society depends exclusively
on the generation of profits from publishing is to my mind a very unfortunate
thing to have to say. I don't believe it's true.
There are
societies that do well without making profits from journals. I think this
may require multiple years of changing the mind set, especially of young
scientists, towards the function of scientific societies. These societies
are our unions. They work for us, they do many things, and they don't
simply exist to publish journals. They exist to nurture the careers of
young scientists, to help in the public debates about the importance of
science, and to help the government see the wisdom of supporting science
through government agencies. All these activities are important and may
require raising dues. I can't imagine people failing to support their
society because the society has helped to advance innovations in publishing
to make science a richer experience but it will take time to change attitudes.
I know there
is a time when you are encouraged to join societies because you're told
that if you join a society you will get some journals at a cut rate. But
a change in society policy that promotes completely unfettered access
to all journals should be something societies agree to. I think we
may have to encourage a sense of citizenship and public responsibility
toward the support of scientific societies. I actually believe that is
realistic and can happen, assuming it is done in a measured way.