Towards a Science Commons
The sciences depend on access to and use of factual data. Powered by
developments in electronic storage and computational capability,
scientific inquiry today is becoming more data-intensive in almost
every discipline. Whether the field is meteorology, genomics, medicine,
ecology, or high-energy physics, modern research depends on the
availability of multiple databases, drawn from multiple public and
private sources, and on the ability of those diverse databases to be
searched, recombined, and processed.
In the United States, this process has traditionally been supported by
a series of policies, laws, and practices that were largely invisible,
even to those who worked in the sciences themselves.
First, US intellectual property law (and, until recently, the law of most
developed countries) did not allow for intellectual property protection
of "raw facts." One could patent the mousetrap, but not data on the
behavior of mice, or on the tensile strength of steel. The article
could be copyrighted but the data on which it rested could not be.
Commercial proprietary ownership was to be limited to a stage close to
the point where a finished product entered the marketplace. The data
upstream remained for all the world to work upon.
Second, US law mandated that even works that normally could be copyrighted, if
produced by the federal government, fell immediately into the public
domain - a provision of great importance given massive governmental
involvement in scientific research. More broadly, the practice in
federally funded scientific research was to encourage the widespread
dissemination of data at or below cost in the belief that, like the
interstate highway system, this provision of a public good would yield
incalculable economic benefits for society as a whole.
Third, in the
sciences themselves, and particularly in the universities, a strong
sociological tradition - sometimes called the Mertonian tradition of
open science - discouraged the proprietary exploitation of data itself,
as opposed to inventions derived from data, and required, as a
condition of publication and replication, the availability for examination
of the datasets on which the work was based.
Each of these three central tenets is now either under attack, or
subject to serious reservations.
For example, in the realm of genetics, patent law has moved perilously
close to accepting an intellectual property right over raw facts - the
C's, G's, A's and T's of a particular gene sequence.
In other areas, complex contracts of adhesion create de facto
intellectual property rights over databases, complete with "reach
through agreements" and multiple limitations on use. More disturbingly,
the US is considering, and the EU has already adopted, a "database
right." This "database right" actually does accord intellectual
property protection to facts - upsetting one of the most fundamental
premises of intellectual property: that one could never own facts or
ideas, but could exert ownership only over the inventions or
expressions yielded by their intersection.
The Federal government's role is also changing. Under the pressure of
the important and, in many ways, admirable Bayh-Dole statute, federally
funded research in universities is now pushed towards early proprietary
exploitation. Universities then become partners in privatizing and
exploiting the fruits of research. While this is a good idea when it
encourages the conversion of science into useful products brought to
market, it is much more questionable when the proprietary pressures
occur "upstream" at the most fundamental level of data and research. At
the same time, universities depend more and more on their intellectual
Under these twin pressures, the third leg of the tripod is also
beginning to crack. Science is not open when scientists are bound up in
confidentiality agreements, and when proprietary concerns limit or
prohibit the transfer of the full datasets on which they work. In this
atmosphere, institutions, often unconsciously, begin to encourage
secretive practices they formerly frowned on.
Science policy, too, begins to change as universities can no longer be
depended on to play the role of public defender for the public domain
that they traditionally played in the legislative realm. Around the
world, government departments may begin to look at datasets as a source
of revenue to be exploited, rather than a public good to be provided.
The important National Research Council study, Bits of Power, records
the tragic consequences that this tendency had for access to satellite
and weather data.
Many of the tendencies here involve both a collective action problem
and a race to the bottom. Universities as a whole might be better off
if more data were freely available. However, for an individual
university to pursue such a policy alone is hard, and sometimes
foolish: one is reluctant to give away that for which everyone else
attempts to charge a high price.
The same tendency occurs in different ways outside the university
setting. Local governments might attempt to sell data previously
gathered at taxpayer expense for public purposes, even though open
sharing of this data among communities and with citizens and businesses
might produce far greater overall long-term economic and societal
rewards. Simply put, individual government departments do not
necessarily have incentives to try to make a deal that will benefit the
government or the public as a whole, rather than simply their specific
The Search for a Solution
These facts have not gone unnoticed. Numerous scientists have pointed
out the tragic irony that at the historical moment when we have the
technologies to permit worldwide availability and distributed
processing of scientific data, and the concomitant promise for
broadening collaboration and accelerating the pace and depth of
discovery, we are busy locking up data and slapping legal restrictions
on its transfer.
The learned societies, the National Academies of Sciences, the National
Science Foundation, and other groups have all expressed concern about
the trends that are developing right now. Much attention has been
focused on proposals for legislative change which, while important,
will be both extremely hard to push through and incomplete as a
solution. Any solution will need to be as complex as the problem it
seeks to solve, which is to say it will be interdisciplinary,
multinational, and involve both public and private initiatives.
That is why Science Commons has convened this conference: to help
envision a way in which scientific data itself can be more open and
accessible to drive more discovery, more progress, more improvement of
life in the world in which we live.
Science Commons, a project of the non-profit Creative Commons, is the
sponsor and organizer of the Commons of Science Conference. Our goal is
to promote innovation in science by lowering the legal and technical
costs of the sharing and reuse of scientific work. We remove
unnecessary obstacles to scientific collaboration by creating voluntary
legal regimes for research and development.