Edit: A summary of the recent Open Science session at the Berlin Colloquium on Internet and Society with talks from Constanze Engelbrecht, Pasco Bilic and Christoph Lutz has been posted by the Humboldt Institute’s Benedikt Fecher (German version, English version). The text below is a more general discussion of how Open Science can be defined.

One area of research at the Alexander von Humboldt Institute is Open Science, an emerging term used to describe new ways of conducting research and communicating its results through the Internet. There is no single definition of what constitutes Open Science (and one could argue there doesn’t really need to be), but in this blog entry I want to point to attempts to define the term by prominent scientists and activists, and discuss some of the limitations of these definitions. I’ll summarize my observations in the form of five questions that suggest a direction that future research into Open Science could take.

Open Science: a few working definitions

Michael Nielsen – a prominent scientist and author on Open Science whose name pops up invariably when discussing the topic – provides this very comprehensive definition in a post to the Open Science mailing list:

“Open science is the idea that scientific knowledge of all kinds should be openly shared as early as is practical in the discovery process.”

In the same vein, Peter Murray-Rust, a professor of molecular chemistry and Open Access advocate, provides another definition (also through the OKFN’s open science mailing list):

“In a full open science process the major part of the research would have been posted openly and would potentially have been available to people outside the research group both for reading and comment.”

(Also see this interview if you want a more detailed exposition).

Finally, Jean-Claude Bradley, also a professor of chemistry, provides a definition of what he calls Open Notebook Science, a very similar approach:

“[In Open Notebook Science] there is a URL to a laboratory notebook that is freely available and indexed on common search engines. It does not necessarily have to look like a paper notebook but it is essential that all of the information available to the researchers to make their conclusions is equally available to the rest of the world.”

(Here’s a presentation summarizing his approach, Open Notebook Science. A similar view is articulated by M. Fabiana Kubke and Daniel Mietchen in this video, though they prefer the term Open Research.)

From natural philosophy to science

One thing that these different definitions have in common is the way in which they frame science. In English, the word science has come to denote primarily the natural sciences (traditionally physics and chemistry, more recently also biology and life sciences). The history of the term is long and complex (check out the Wikipedia entry), but as a result of this narrowing, a wide range of disciplines are considered not to be part of the sciences, but instead belong to the social sciences and Humanities.

Why does this matter? The above definitions are very closely tailored to the methods and organizational structures of the natural sciences. They assume that research is conducted in a research group (Murray-Rust) that works primarily in a laboratory and whose members record the steps of an experimental process in a lab notebook (Bradley), following a sequence of more or less clearly-structured steps that can be summarized as “the discovery process” (Nielsen).

Research processes in other fields differ strongly from this approach, not just in the Humanities (where there is frequently no research group, and data is of varying relevance), but also in the social sciences (where there is generally no laboratory, and data frequently comes from human subjects rather than technical instruments such as radio telescopes or DNA sequencers). These differences go beyond the choice of tools: instruments also shape their users’ assumptions about the world, and about what they do. Sociologist Karin Knorr-Cetina points to this difference in the title of her book Epistemic Cultures, and similar observations have been made in Bruno Latour & Steve Woolgar’s Laboratory Life: The Construction of Scientific Facts. One crucial aspect of this is how data is conceptualized in the different disciplinary perspectives, and, related to this, how notions of what openness means differ.

Openness beyond open access to publications

Openness can be defined in a variety of ways. Not all information that is available online is open in a technical sense – just think about proprietary file formats that make it difficult to share and re-use data. Technical openness does not equal legal openness, a problem that is also on the institute’s agenda.

Open Access – the technical and legal accessibility of scholarly publications via the Internet – is widely regarded as benefiting both science and the public at large. In the traditional publishing model, access to research results in scholarly monographs and journals is available to subscribers only (usually institutional subscribers, in other words, libraries). The Open Access model shifts the costs, sometimes to authors (who pay a fee to publish) or to publishing funds or other institutional actors. The Budapest and Berlin Declarations on Open Access specify under which provisions publications are truly Open Access, rather than just somehow accessible. Open Access has a range of benefits, from reducing costs and providing access to scientists at small universities and in developing countries, to increasing transparency and raising scholarly impact. Models based on author fees, such as the one utilized by PLoS, are increasingly common and make Open Access economically feasible.

There is broad consensus that Open Access is a first step, but that it’s not enough. Many scientists, such as the ones cited above, call for research data to be made available more broadly as well. Sharing research data, instead of packaging data and analysis together in scholarly articles, could enable new forms of research far more cumulative than current practices, which tend to emphasize positive outcomes (experiments that worked) over negative ones (those that didn’t), despite the fact that negative outcomes can greatly contribute to better understanding a problem.

Making openness count

The barriers to achieving a more open environment with regard to research data aren’t primarily technical or legal, but cultural. Research has always been based on the open dissemination of knowledge (just take the history of the Philosophical Transactions, considered by most to be the oldest scientific journal), but it is also very closely tied to the formats in which knowledge is stored and disseminated, such as books, journal articles, and conference papers, which tend to take on a valorizing role rather than being just arbitrary containers of scholarly information. Many scholars, regardless of their field, see themselves in the business of publishing books, articles, and papers just as much as they consider themselves to be in the business of doing research. While the technology behind scholarly publishing has changed dramatically, the concepts have not. Because institutionalized academia is incentive-driven and highly competitive, collective goals (a more efficient approach to knowledge production) are trumped by individual ones (more highly-ranked publications = more funding and promotions for the individual researcher).

Institutional academia is no longer the only place where research happens. Increasingly, there is (if only latent) competition from crowdsourcing platforms that facilitate collaborative knowledge creation (and, more broadly, problem solving) outside of institutional contexts. Depending on how you define the process of knowledge production, examples include both Wikipedia and projects such as the #SciFund Challenge. The approach to knowledge production in these environments currently seems to focus on knowledge recombination and remixing, but it appears plausible that more sophisticated models could arise in the future. Whether these hybrid communities of knowledge production have the potential to displace established institutional academia remains to be seen; more likely, such communities will blossom in those areas where traditional academia fails to deliver.

But even inside institutional academia, the time seems ripe for more openness beyond making publications and data available to other academics. Social media makes it possible for scholars to both communicate with their peers and engage with the public more directly — though they are still hesitant to do either at the moment. Public visibility is not as high on the agenda of most researchers as one might expect, because academic success is largely the result of peer, not popular evaluation.

Redefining scholarly impact

This may change as new, more open measurements of scholarly impact enter the mainstream. Measuring and evaluating the impact and quality of publicly-funded research has been a key political interest for decades. While frameworks exist for conducting large and complex evaluations (Research Assessment Exercises in the UK, Exzellenzinitiative in Germany), the metrics used to evaluate the performance of researchers are generally criticized as too one-dimensional. This criticism applies in particular to measurements that indicate the quality of publications, such as Thomson Reuters’ Impact Factor (IF). A combination of measures (downloads, views, incoming links) could change the current, extremely one-sided approach to evaluation and make it more holistic, generating a more nuanced picture of scholarly performance.
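To make the idea of combining measures concrete, here is a toy sketch of what a composite indicator might look like. The metric names, the normalization against an assumed field-specific maximum, and the weights are all hypothetical choices for illustration; they do not correspond to any established altmetrics standard.

```python
# Toy sketch of a composite scholarly-impact indicator.
# All metric names, field maxima, and weights below are
# hypothetical assumptions, not an established standard.

def composite_impact(metrics, weights):
    """Weighted sum of normalized metrics.

    metrics: dict of name -> (observed value, assumed field maximum);
             each metric is scaled to the 0..1 range before weighting.
    weights: dict of name -> weight (weights for missing names are 0).
    """
    score = 0.0
    for name, (value, field_max) in metrics.items():
        normalized = min(value / field_max, 1.0) if field_max else 0.0
        score += weights.get(name, 0.0) * normalized
    return score

# Hypothetical numbers for a single article
article = {
    "downloads":      (1200, 5000),    # (observed, assumed field maximum)
    "views":          (8000, 20000),
    "incoming_links": (35, 100),
}
weights = {"downloads": 0.4, "views": 0.3, "incoming_links": 0.3}

print(round(composite_impact(article, weights), 3))  # prints 0.321
```

Even this crude sketch makes one design question visible: the choice of weights and normalization baselines encodes a judgment about what counts, which is exactly the kind of decision that would need disciplinary consensus rather than a single formula.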

Questions for research into Open Science

The following questions reflect some of the issues raised by “open” approaches to science and scholarship. They are by no means the only ones, as the Open Science project description on the pages of the HIIG highlights, but reflect my personal take.

  1. How can Open Science be conceptualized in ways that reach beyond the paradigm of the natural sciences? In other words, what should Open Humanities and Open Social Sciences look like?
  2. How do different types of data (recorded by machines, created by human subjects, classified and categorized by experts) and diverse methods used for interacting with it (close reading, qualitative analysis, hermeneutics, statistical approaches, data mining, machine learning) impact knowledge creation and what are their respective potentials for openness in the sense described by Nielsen, Murray-Rust and Bradley? What are limits to openness, e.g. for ethical, economic and political reasons?
  3. What are features of academic openness beyond open access (e.g. availability of data, talks, teaching materials, social media presence, public outreach activities) and how do they apply differently to different disciplines?
  4. How can the above-mentioned features be used for a faceted, holistic evaluation of scholarly impact that goes beyond a single metric (in other words, that measures visibility, transparency and participation in both scientific and public contexts)?
  5. What is the relationship between institutionalized academia and hybrid virtual communities and platforms? Are they competitive or complementary? How do their approaches to knowledge production and the incentives they offer to the individual differ?
