I just came back from Deidesheim, a small town (yet with an oddly epic entry in English Wikipedia — what’s up with that?) located in an area of Germany best known for its excellent Riesling, where I participated in the annual meeting of the SciLogs blogging community. My role in Deidesheim, together with my colleague Merja Mahrt, was to nominate a blogger for the SciLogs ’12 best blog award (here’s the winner!) and to give a talk on research on scholarly blogging (slides below).

The past ten days have been a whirlwind tour of sorts, with no fewer than three (!) different events related to scholarly/science/research blogging that I attended, and I want to take a moment to reflect on some of the things that were discussed and record a few thoughts they provoked.

So let’s start with a list of the events.

Weblogs in the Humanities, Munich

Picture of me during my talk at 'Weblogs in the Humanities'. Photo by Wenke Bönisch.

Last week, I presented at the conference Weblogs in den Geisteswissenschaften (Weblogs in the Humanities), organized by the Deutsche Historische Institut Paris and supported by hypotheses.org, a platform operated by Cléo, a section of the CNRS. The newly launched portal de.hypotheses.org is aimed at the German-speaking scholarly community and follows the model of its French parent. Weblogs (or carnets de recherche, as they are branded under the hypotheses label) are more widely read in France than they are in Germany, a factor which I think partly explains their uptake. Another key to their success seems to be the way they are supported: for example, each blog is provided with an ISSN, making it easier to cite. As part of the editorial team behind de.hypotheses.org, I’m excited to see whether the platform will succeed and follow in the footsteps of its French counterpart, which hosts an impressive 300 scholarly blogs. The conference was certainly an indicator that the topic is on a lot of people’s radar. More detailed reports from the event can be found here (keynote speaker Melissa Terras, in English), here (Wenke Bönisch, in German) and here (Anton Tantner, also in German). During a break, I had the chance to interview Melissa for my postdoc project and was myself interviewed for the German Humanities portal LISA. Thank you to Melissa for taking the time to chat with me and to Georgios Chatzoudis for asking some very thought-provoking questions!

Video of talk | Audio interview

Symposia on e-Social Science, Oxford

Next I flew to England for the first time in several years, to visit the Oxford Internet Institute and attend two events, Social Science and Digital Research: Interdisciplinary Insights and Digital Social Research: A Forum for Policy and Practice. There was also a dinner on Monday to mark the formal ending of the Oxford e-Social Science Project and a breakfast on Tuesday morning, where the European Commission’s SESERV project was discussed and recommendations on how to integrate e-social science methods into teaching and research more closely were formulated. All of these events were related to the Oxford e-Social Science project in one way or another, so digital scholarly communication was just one facet of that larger theme. People had a broad discussion of research and teaching practices in the social sciences and how e-science fits into the mix. I found Christine Borgman’s keynote on reproducibility very thought-provoking. We take it for granted that open data will make the research process more transparent, and hopefully this is true, but what reproducibility actually amounts to is widely contested and especially tricky in the context of the human and social sciences.

SciLogs Meeting 2012, Deidesheim

Bloggers chatting at scilogs12 in Deidesheim.

After my visit to Oxford, it was on to Deidesheim via Düsseldorf. The SciLogs meetup was a different kind of event again from both the Munich conference and the symposia in England. SciLogs is comparable to scienceblogs.com in that it’s run by a publisher (Spektrum der Wissenschaft), which has launched it largely as a source of popular science content, and in that it has an orientation towards the natural sciences (though there are also blogs on history, linguistics and a variety of other fields). It was exciting to chat with people who have been involved in science blogging for years and to learn more about what drives them. I was particularly impressed by the enthusiasm that the sciloggers have for their blogs and their readers. Blogging is hard work (as the glacial pace of my postings here illustrates…) and Spektrum Verlag can be quite proud of the community it has built around the idea of better informing people about scientific research.

Below are some somewhat random points that I found noteworthy.

Scholarly/academic/science/research blogs are written by a wide variety of people (e.g. scholars, journalists, librarians, science enthusiasts), for a wide range of audiences (e.g. self, peers, people in the same field, practitioners, politicians, general public) with a variety of purposes in mind (self-fulfillment, knowledge management). It’s important not just to regard them exclusively as a form of science communication, but to see the many roles they take on for a range of users.

Just as scholarly bloggers and their topics are a diverse bunch, readers and commentators of S/a/s/r blogs have different reasons for visiting and participating. A key motivation among commentators could be that they can add their view to a post. This may seem obvious, but it’s interesting for several reasons. For example, there is fairly little dialogue going on in posts that have a lot of comments. The commentators simply add their take and then leave, without engaging with the blogger or with each other. Debates that do have a lot of actual discussion sometimes devolve into arguments between individual users that have little to do with the original post. This isn’t a bad thing, but it illustrates that it’s a bad idea to give in to the temptation to assume that a large number of comments translates into success. Or, perhaps speaking of personal success is alright as long as one doesn’t mistake it for societal impact. Another issue is the relation between commentators and readers. It’s not trivial to figure out whether fewer comments mean less attention on the part of readers.

In order to play a role in mainstream scholarly communication, which is still conducted primarily via monographs and journals, scholarly blogging must integrate some of the conventions that exist in these forms (quality control, long-term availability of content, citability) if it is to succeed as a formal genre, while preserving its intrinsic strengths (speed and simplicity of publication, the potential of interaction via comments, the ability to embed images, video, audio, data and code, the ability to link and quote, the ability to track one’s impact via metrics). The adaptation can happen in multiple ways, and it only applies to formal scholarly communication; what happens informally or for other purposes remains unaffected, as does blogging about science by journalists or hobbyists. Blogging about science and scholarship is obviously in the public interest. The question is, should this be left to the researcher, or should it be incentivized by institutions? As Klaus Graf put it somewhat radically at the conference in Munich, is a researcher who doesn’t blog a bad researcher?

I’ve already shared this bit of personal news with a few friends and colleagues, but I thought I’d blog about it as well — especially since I’m woefully behind on my Iron Blogger schedule. ;-)

After a fairly long time in the making, I have been awarded a three-year research grant from the Deutsche Forschungsgemeinschaft (DFG) for the project Networking, visibility, information: a study of digital genres of scholarly communication and the motives of their users (summary in German on the DFG’s site). The project investigates new forms of scholarly communication (especially blogging and Twitter) and their role for academia. My key concerns are usage motives, i.e. why scholars use blogs and Twitter, and how these motives correspond with usage practices (how they blog and tweet), rather than how many researchers use these channels of communication or what makes them refrain from using them (see this blog post and the study mentioned in it for that kind of work). My main methods will be qualitative interviews with a sample of 20-25 blogging and/or tweeting academics, along with in-depth content analysis of the material they post in these channels over a prolonged period (>1 year). Identifying usage patterns and relating them to the participants’ narrative about their use will be another key objective. Ultimately, I hope to find a (tentative) answer to the question what role blogs and Twitter may play for the future of digital scholarship, and whether they will remain a niche phenomenon or become mainstream over time.

The project follows up on my work on corporate blogging and connects strongly to what we have been doing at the Junior Researchers Group “Science and the Internet” over the past year, but the focus on interviews should result in a more user-centric analysis. As someone who has been doing (applied) linguistic analysis to make inferences about social processes, I feel much more comfortable actually talking to the people I want to study, rather than just crunching numbers on how they tweet. Big data social science research is obviously and understandably en vogue these days, but I hope to find a good synergy between qualitative and quantitative approaches in my project.

My new institutional home for the next three years will be the Berlin School of Library and Information Science at Humboldt University. I’m grateful to Michael Seadle for supporting my project and really look forward to working with my new colleagues at IBI (that’s the German acronym, which, as far as I can tell, is preferred to its more entertaining English equivalent). I also look forward to working with colleagues from the Alexander von Humboldt Institute for Internet and Society (HIIG) where I’m currently supporting the project Regulation Watch. Finally, I plan to keep in close contact with the colleagues in Düsseldorf, both at the Junior Researchers Group and the Department of English Language and Linguistics, where I have learned virtually everything I know about being a researcher. I am especially indebted to Dieter Stein for his enduring support and for his contagious enthusiasm for all aspects of scholarship.

Sic itur ad astra! :-)

For an overview of previous work I’ve done in this direction, have a look at my publications.


Those of you following my occasional updates here know that I have previously posted code for graphing Twitter friend/follower networks using R (post #1, post #2). Kai Heinrich was kind enough to send me some updated code for doing so using a newer version of the extremely useful twitteR package. His very crisp, yet thoroughly documented script is pasted below.

# Script for graphing Twitter friends/followers
# by Kai Heinrich (kai.heinrich@mailbox.tu-dresden.de)

# load the required packages

library("twitteR")
library("igraph")

# HINT: In order for the tkplot() function to work on Mac you need to install
#       the TCL/TK build for X11
#       (get it here: http://cran.us.r-project.org/bin/macosx/tools/)
#
# Get user information with the twitteR function getUser();
# instead of your own name you can do this with any other username as well

start <- getUser("YOUR_USERNAME")

# Get friend and follower names by first fetching the IDs
# (getFollowerIDs(), getFriendIDs()) and then looking up the names (lookupUsers())

friends.object <- lookupUsers(start$getFriendIDs())
follower.object <- lookupUsers(start$getFollowerIDs())

# Retrieve the names of your friends and followers from the friend
# and follower objects. You can limit the number of friends and followers by
# adjusting n, the number of followers/friends that you want to visualize.
# head() returns at most n entries, so accounts with fewer than n friends or
# followers are handled as well. Drop the head() call to visualize all of them.

n <- 20
friends <- sapply(head(friends.object, n), function(u) u$name)
followers <- sapply(head(follower.object, n), function(u) u$name)

# Create a data frame that relates friends and followers to you for expression in the graph
relations <- merge(data.frame(User = 'YOUR_NAME', Follower = friends),
                   data.frame(User = followers, Follower = 'YOUR_NAME'), all = TRUE)

# Create graph from relations.
g <- graph.data.frame(relations, directed = TRUE)

# Assign labels to the graph (=people's names)
V(g)$label <- V(g)$name

# Plot the graph using plot() or tkplot(). Remember the HINT at the
# beginning if you are using Mac OS X
tkplot(g)
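
To see what the relations data frame in the script looks like without calling the Twitter API, here is a minimal sketch in base R. The account names ("me", "alice", "bob", "carol") are made up for illustration and are not part of Kai's script; the merge() call mirrors the one used above.

```r
# Hypothetical account names, used only for illustration
me <- "me"
friends <- c("alice", "bob")     # accounts that "me" follows
followers <- c("bob", "carol")   # accounts that follow "me"

# Same construction as in the script: one row per directed edge,
# first column = source of the edge, second column = target
relations <- merge(
  data.frame(User = me, Follower = friends),
  data.frame(User = followers, Follower = me),
  all = TRUE
)
print(relations)

# Accounts that appear in both lists are connected to "me" in both directions
mutual <- intersect(friends, followers)
print(mutual)
```

Passing a data frame like this to graph.data.frame() should give one vertex per distinct name and one directed edge per row, which is why mutual contacts end up with two arrows in the plot.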

Berlin 9: The Worldwide Policy Environment (Wednesday)

On November 10, 2011, in Events, by cornelius

Avice Meehan moderated the first session of the Berlin 9 Open Access Conference, on The Worldwide Policy Environment. She introduced the three presenters:

  • Jean-François Dechamp, Policy Officer, European Commission, Directorate-General for Research and Innovation
  • Harold Varmus, Director, U.S. National Cancer Institute
  • Cyril Muller, Vice President, External Affairs Department, The World Bank

After a brief introduction by Avice, Jean-François Dechamp took to the podium to talk about the European policy context of Open Access. Jean-François described how the European Commission acts as a policy maker, a funding agency, and an infrastructure funder and capacity builder. He cited Commission documents stating that “publicly funded research should be open access” and noted that the Commission aims to make Open Access to publications “the general principle for projects funded by the EU research Framework Programmes”. Key reasons for the European Commission to support Open Access include serving science and research, benefiting innovation and improving return on investment in R&D. OA publishing costs (article charges) are covered by FP7, although fairly few researchers realize this. Dechamp cited a study conducted by the EUC where the majority of researchers involved indicated that they were ready to self-archive, but that the legal challenges were daunting. He cited a soon-to-be-released study (ERAC, 2010-2011) that found that the overall significance of OA in the member states has increased markedly over the past few years.

Harold Varmus of the U.S. National Cancer Institute and NIH came next. Harold stressed that he was not speaking as the representative of a policy-making institution, but as a scientist. He lamented that the shift towards OA is not happening fast enough and asked for a broader idea of Open Access that must go beyond access to publications, to access to data and (ultimately) knowledge. True Open Access, according to Harold, means gold road OA, in accord with the Berlin Declaration; embargoes aren’t good enough. Harold traced his contact with OA to 1998, when he heard about arXiv (built by Paul Ginsparg) and thought that such a resource should also exist for biomedicine. He went on to emphasize that different fields have different needs, and that publishing must be sensitive to these needs. Harold also stressed the success of PubMed Central, which now holds 2 million articles. In 2006 publishers were encouraged to donate articles (with limited success); in 2008 a mandate was introduced to publish NIH-supported research on PubMed Central after an embargo period. Harold noted that economics are essential and that there’s always a business plan attached to journals. He noted that while researchers love their publishers, they love the people who give them money even more, pointing to the central influence of funders in relation to OA. Harold noted the success of PLoS, specifically of PLoS ONE. He further echoed Cathy Norton’s observation that the public at large wants access: not just abstracts and titles, but the actual data. While articles are the best product of academic research, they are also emotionally laden. Harold noted that while funders see articles as mere vehicles of knowledge, authors also write for fame and prestige, not just to contribute to knowledge. He closed by arguing strongly for a new regime of review (post rather than pre). Authors should be required to list their most important contributions, rather than being judged by bean counting based on long publication lists and the impact factor.

Cyril Muller approached the topic differently in his talk, focusing on the Open Data approach of his institution, the World Bank, and on the positive effects that they had observed in making the data collected by them digitally available. He described the three pillars of their approach (Open Data, Open Knowledge, Open Solutions) and presented statistics on how much information was now made available online, rather than in print, via their Open Knowledge Repository. He provided interesting examples of information-enabled innovation in Africa and elsewhere. My notes are unfortunately somewhat incomplete on Cyril’s talk, but it really focused on Open (Government) Data more than on Open Access (to Scholarly Publications), putting it more into a thematic camp with a variety of initiatives from that direction.

Berlin 9: Opening session (Wednesday)

On November 10, 2011, in Events, by cornelius

These are my notes from the opening session of the Berlin 9 Open Access Conference. I’ve already blogged the pre-conference workshops on Open Access Publishing and Open Access Policy.

The conference opened with welcoming remarks, first from the HHMI’s VP and chief scientific officer Jack E. Dixon, then from HHMI’s head Robert Tjian, followed by the Max Planck Society’s Bernard Schutz, and finally from the Marine Biology Lab’s Cathy Norton. Jack Dixon struck an optimistic note, observing that “the tide is turning, in a very positive way.” Robert Tjian observed that those who fund research should be more active in publishing, a reference to eLife, a new Open Access journal in the life sciences jointly launched by HHMI and the Max Planck Society. He went on to note that “scientific work is not complete before the results become accessible… what we do doesn’t have any impact otherwise.” Bernard Schutz focused on the development of the Berlin Declaration in his talk. 30 institutions had been original signatories in 2003 when the Declaration was first drafted; 338 institutions are now among the signatories. A global expansion of the Berlin meetings from Europe (Berlin 1 to Berlin 7) to the world (Berlin 8 in China, Berlin 9 in the U.S.) had been vital, because “research and publishing are global issues”. Bernard noted that much had been achieved in relation to green road OA and repositories, but that the Max Planck Society regards the popularization of gold road open access as an important achievement for the future. He went on to note that interdisciplinarity and innovation (e.g. in business) are enabled by OA. Free information is a common good and the spread of knowledge to stakeholders outside academia (teachers and students) is enabled by OA. Bernard observed that to many publishers “the business model is less important than the business itself” and that many publishers would transition to OA if viable business models could be established. He described disagreements between publishers, institutions, and researchers in some areas and stressed that the Max Planck Society is ready to work with all stakeholders on the issues at hand. Finally, he stated “we want to become more inclusive” and characterized Open Access as part of a larger movement towards (more) Free Information.

Cathy Norton from the Marine Biology Lab focused on issues close to her field in her talk. She discussed the success of MedLine and pointed out how interested the public is in certain areas of scientific information. The future of medicine, according to Cathy, lies in personalization of drugs and treatments, something that can only be achieved by having large volumes of data freely available. Techniques such as text mining and visual search are key to utilizing such new approaches, as are efforts such as semantic MedLine that map ontological relationships in large volumes of text. Cathy closed by noting the importance of citizen engagement, e.g. in relation to biodiversity data (95% of the publications on biodiversity are from North America and Europe, while the species described are virtually all found in Africa and South America).

The session closed with a question from Stuart Shieber, who wondered how the Max Planck Society wants to support creating an environment that allows publishers to transition to Open Access, something that Bernard Schutz had hinted at. Bernard replied that there were ongoing conversations between publishers and the MPS on these issues.

This is my second report from the Berlin 9 Open Access Conference, this one summarizing Tuesday’s session on Open Access Policy. I’m still catching up on yesterday’s talks and will post those later today or early tomorrow.

The session was moderated by Alma Swan of Enabling Open Scholarship, also director of Key Perspectives Ltd. Alma introduced the panelists:

  • Bernard Rentier, Rector, Université de Liege
  • Stuart Shieber, Director, Office for Scholarly Communication, Harvard University
  • William Nixon, Digital Library Development Manager, University of Glasgow
  • Jeffrey Vitter, Provost and Executive Vice Chancellor, University of Kansas

After this, Alma laid out some of the key issues on which the presenters would focus in their talks, namely the precise wording of the institutional open access policy that they have put into place, the people involved in planning and implementing it, the nature of the implementation and finally the resources for ongoing support (as she pointed out, if there is no ongoing support, open access does not work). Alma then proposed an elaborate typology of policies based on multiple factors, i.e. who retains rights, whether or not there is a waiver, when deposit takes place and whether or not there is an embargo on the full text or the article metadata. I’m hoping to be able to include Alma’s slides here later; there is a very nice table in them that describes these points.

Bernard Rentier from the Université de Liege in Belgium was the first presenter and gave a very engaging talk. He started with the analogy that a university that doesn’t know what it is publishing is like a factory that doesn’t know what it’s producing. The initial motivation at Liege was to create an inventory of what was being published there. Scholars wanted to be able to extract lists of their publications easily and be more visible to search engines. Bernard went on to describe what he called the Liege approach of carrot and stick and summarized this by saying “If you don’t have a mandate, nothing happens. If you have a mandate and don’t enforce it, nothing happens.” Having a mandate to deposit articles, the enforcement of this mandate, the quality of service provided and the incentives and sanctions in place are all vital. Bernard then described ORBi, the university’s repository. ORBi has 68,000 records and 41,000 full texts (50%), all uploaded by the researchers themselves. Most of the papers which are not available in full text were published before 2002. Papers which have a record in the repository are cited twice as often as papers by Liege authors that do not have a record, something that Bernard attributed to their strongly improved findability. Not all full texts in ORBi are Open Access: roughly half of the texts are embargoed, waiting to be made available after the embargo has been lifted. Bernard explained that 20% of what is published in ORBi constitutes what is often called grey literature (reports, unpublished manuscripts), which was now much more visible than before. He noted that ORBi had been marketed as being “not just another tool for librarians”, and that his goal had been to involve the entire faculty, something that was also furthered by making the report produced by ORBi the sole document relevant in all performance reports (e.g. for promotions, tenure). ORBi is linked to Liege’s digital university phone book, tying it to general identity information that people might search for. It is also mentioned aggressively on the university website rather than being hidden away on the pages of the library. Bernard closed by saying that today ORBi was attracting an impressive 1100 article downloads per day and that plans were underway to use the system at the University of Luxembourg, the Czech Academy of Sciences and other institutions.

Stuart Shieber followed with a talk on the development of the Harvard Open Access mandate, introduced in 2008. Since its original introduction, a total of eight Harvard schools have joined the agreement, which generally mandates use of the institutional repository for publications (there is a waiver). Stuart explained that preparations began in 2006 and that there was much discussion in the academic senate. The FAS faculty voted in February 2008 and unanimously accepted the new policy. Its structure was outlined by Stuart as follows:

  • permission (1): author grants the university rights
  • waiver (2): if you want a waiver, you get a waiver
  • deposit (3): mandate of deposit on publication, also everything is deposited including material under embargo

This creates a structure where authors retain a maximum of control over their publications, yet generally deposit what they publish in the university’s repository. Stuart closed by saying (in reference to Bernard) “We’re not trying to apply a stick, we’re trying to apply a carrot” (e.g. statistics for authors on their article use and other incentives).

Next up was William Nixon of the University of Glasgow, who presented their repository, Enlighten. William started by saying that he wasn’t wild about the terms “mandate” and “repository”, but that they had sought to communicate the usefulness of Enlighten to authors, winning them over rather than forcing them to use the service. He described the wide integration of Enlighten with other services and cited a statistic showing that 80% of traffic to the repository comes from Google. William then gave a historical account of their approach. After launching Enlighten in 2006 and “strongly encouraging” its use by authors, virtually nothing happened. In 2007 a student thesis mandate was introduced, making it a requirement for all theses to be deposited. In 2008, all publications “where copyright permits” by the faculty were included. In 2010, the report generated by Enlighten was made a key element of the overall research assessment, an important step mirroring the strategy used in Liege. William also discussed staff concerns: What content must be provided? Am I breaking copyright law by using the repository? How and by whom will the publication be seen and accessed online? What version (repository vs. publisher) of my publication will be cited? William closed by giving a brief account of the repository’s performance record: 14,000 new records had been added in 2010 alone, a rapid growth.

The University of Kansas’ provost and vice chancellor, Jeffrey Vitter, gave a historical account of how KU Scholarworks, the university’s repository, had been gradually developed and introduced, and pointed to the importance of the advocacy of organizations such as ARL, who had promoted the idea behind IRs and Open Access for many years, making it easier to popularize the idea among the faculty. I apologize for not having an in-depth account of Jeffrey’s talk, but at this point jet lag caught up with me. If you have any notes to contribute for this or any other part of the session, please share.

In the Q&A that followed the presentations, what stuck with me was Bernard Rentier’s response to a question from an Elsevier representative about whether collaboration with publishers was not paramount for the success of an open access policy. Bernard emphatically described the difficulties he had experienced in the past when negotiating with major publishers and made clear that, while he was open to collaboration, a sign of trust would be in order first.

This is my first post reporting from the Berlin 9 Open Access Conference taking place in Bethesda this week. I’ll be reporting and summarizing as thoroughly as I can starting with two pre-conference sessions that took place yesterday.

Note: I’ll include the presenters’ slides here if I can somehow get my hands on them. Stay tuned.

Christoph Bruch of the Max Planck Digital Library (MPDL) opened the first pre-conference session on Open Access Publishing by introducing the four presenters:

  • Neil Thakur, NIH (perspective of funders and government)
  • Peter Binfield, PLoS ONE (perspective of an OA publisher)
  • Pierre Mounier, Cléo/OpenEdition.org (alternate approach to gold/green OA)
  • Caroline Sutton, OAP Association & Co-Action Publishing (perspective of OA advocacy)

Neil Thakur started his talk by saying that he was not presenting official NIH policy, but rather a personal perspective. He pointed to the declining level of science funding in the US and argued that the response to this development could only be to work longer, work cheaper, or create value more efficiently, with the emphasis on the last option. In Neil’s view this had also worked in the past: electronic publications are faster to find and easier to distribute than ever in the history of scientific research. However, more papers and more information don’t necessarily mean more knowledge. Knowledge is still costly, both because of paywalls and because of the time that has to be spent on finding relevant information and on integrating it into one’s own research. Neil went on to describe the difficulty and costliness of planning large collaborative projects and the need to increase productivity by letting scientists incorporate papers into their thinking faster. He lamented that many relevant answers to pressing scientific questions (e.g. regarding cancer or climate change) are “buried in papers” and cited natural language processing (NLP), data mining and visual search as techniques that could help to extract more relevant findings from papers. He set a simple but ambitious goal: in 10 years’ time, a scientist should be able to incorporate 30% more papers into their thinking than today. So what kind of access is required for such approaches? Full and unrestricted access is necessary for summarizing content and analyzing the full text; otherwise the computer can’t mine anything and the improvements in efficiency described fail to materialize. Neil made the excellent point that librarians are generally more concerned with how to disseminate scientific findings, whereas funders and scientists are interested in increasing scientific productivity. Libraries sometimes need to adjust to the notion that the university should ideally produce knowledge, and that knowledge takes on a variety of forms, not just that of peer-reviewed publications. Neil called this vision “all to all communication”, an approach that is ultimately about creating repositories of knowledge rather than repositories of papers. His characterization of “a machine as the first reader” of a paper really resonated with me for stressing the future importance of machine analysis of research results (something that of course applies to science much more than to the social sciences and humanities). Neil further argued that fair use is a different goal than analysis by machine and that the huge variety of data formats and human access rights made machine reading challenging. Yet the papers that one doesn’t include in one’s research (e.g. because they aren’t accessible) may be those which are crucial to the analysis. Neil also put on the table different ways of measuring scientific impact and quickly concluded that what we currently have (Impact Factor) is insufficient, a criticism that seemed to resonate with the audience. New measurements should take into account the productivity and public impact of a publication, rather than citations or downloads. Finally, Neil concluded by describing various problems caused by licenses that restrict the re-use of material. Re-use is, among other things, extremely important to companies who seek to build products on openly available research results. He ended by saying that “we’re funding science to make our economy stronger”, driving home the relevance of openness not just for access, but also for re-use.

Peter Binfield’s talk presented his employer (PLoS) and its success in developing a business model based on open access publishing. PLoS started modestly in 2000 and became an active publisher in 2003. Today it is one of the largest open access publishing houses in the world and the largest not-for-profit publisher based in the U.S. With headquarters in San Francisco, it has almost 120 employees. Peter noted that while PLoS’ old mission had been to “make scientific and medical literature freely available as a public resource”, its new mission is to “accelerate progress in science and medicine by leading a transformation in research communication”, broadening its direction from providing access to publications to being an enabler of scientific knowledge generation in a variety of technological ways. Peter stressed that PLoS consciously uses the CC-BY license to allow for full re-use possibilities. He described the author fees model that is financially the publisher’s main source of income (though there is also some income from ads, donations and membership fees) and noted that PLoS’ article fees have not risen since 2009. Fee waivers are given on a regular basis, ensuring that the financial situation of the author does not prevent him/her from publishing. PLoS Biology (founded in 2003) and PLoS Medicine (2004) are the house’s oldest and most traditionally organized journals. They follow the model of Nature or Science, with their own full-time editorial staff, unique front matter and a very small number of rigorously selected papers (about 10 per month). Peter noted that the tradeoff of this approach is that, while it produces excellent scientific content, it is also highly labor-intensive and makes a loss as a result. The two journals were followed up by PLoS Genetics, PLoS Computational Biology, PLoS Pathogens, and PLoS Neglected Tropical Diseases, the so-called PLoS Community Journals, launched between 2005 and 2007. 
These publications are run by a part-time editorial board of academics working at universities and research institutes, rather than by PLoS employees. Only a relatively small administrative staff supports the community that edits, reviews and publishes submissions, which serves to increase the overall volume of publications. Finally, Peter spoke about PLoS ONE, a very important component of PLoS. While traditional journals have a scope of what is thematically suitable for publication in them, PLoS ONE’s only criterion is the validity of the scientific data and methods used. PLoS ONE publishes papers from a wide range of disciplines (life sciences, mathematics, computer science), asking only “is this work scientific?” rather than “is this work relevant to a specific readership?”. Discussions about relevance occur post-publication on the website, rather than pre-publication behind closed doors. Peter continued by stating that PLoS ONE seeks to “publish everything that is publishable” and that because of the great success of the service, PLoS had reached the point of being financially self-sustaining. By volume, PLoS ONE is now the largest “journal” in the world, growth that he also linked to the journal being ranked with an Impact Factor (IF), an important prerequisite for researchers in many countries (e.g. China) who are effectively banned from publishing in non-impact factor journals, something that Peter wryly called “the impact of the impact factors on scientists”. Peter gave the impressive statistic that in 2012, PLoS ONE will publish 1 in 60 of all science papers published worldwide and described a series of “clones”, i.e. journals following a similar concept launched by major commercial publishers. Houses such as Springer and SAGE have started platforms with specific thematic foci that otherwise closely resemble PLoS ONE. 
Finally, Peter spoke about PLoS’ new initiatives: PLoS Currents, a service for publishing below-article-length content (figures, tables etc.) that focuses on rapid dissemination; PLoS Hubs, where post-publication review of Open Access content produced elsewhere is conducted and which aggregates and enriches openly available results; and PLoS Blogs, a blogging platform (currently 15 active bloggers) used mainly for science communication and to educate the public. Peter closed by noting that the Impact Factor is a flawed metric because it is a journal-level measurement rather than an article-level indicator. He described the wider, more holistic approach taken by PLoS, which measures downloads, usage stats from a variety of services and social media indicators.

Pierre Mounier from Cléo presented OpenEdition, a French Open Access platform focused on the Humanities and Social Sciences and based on a Freemium business model. Cléo, the center for electronic publishing, is a joint venture of multiple organizations that employs roughly 30 people. It currently runs revues.org (a publishing platform that hosts more than 300 journals and books), calenda.org (a calendar of currently over 16000 conference calls) and hypotheses.org (a scholarly blog platform with over 240 active bloggers). Pierre explained how Cléo re-examined the golden road open access model and found it to be problematic for their constituency. He regarded the subsidy model (no fees have to be paid; the model favored in Brazil) as very fragile, since support can run out suddenly. On the other hand, author fees potentially restrict the growth of a platform and have no tradition in the Humanities and Social Sciences, which may be a disincentive to authors. Pierre continued by asking what the role of libraries could be in the future. Cléo’s research highlighted that Open Access resources are used only rarely via libraries, while users searching at libraries use toll access (TA) resources more frequently. Open access interestingly enough appears to mean that researchers (who know where to look) access publications more freely, but students tend to stick to what is made available to them via libraries. Because libraries are the point of access to scientific information for students, they use toll access resources more, for which the library acts as a gatekeeper. Pierre explained that the Freemium model they developed on the basis of this observation (a model also used by services like Zotero or Spotify) combines free (libre) and premium (pay) features. Access to HTML is free with OpenEdition, while PDF and epub formats are subscription-based and paid for by libraries. COUNTER statistics are also provided to subscribers. 
Pierre highlighted the different needs of different communities involved in the academic publication process and noted that the Freemium model gives libraries a vital role, allowing them to continue to act as gatekeepers to some features of otherwise open scholarly content. Currently 20 publishers are using OpenEdition, with 38 research libraries subscribing, and 1000 books available.

Caroline Sutton spoke about “open access at the tipping point”, i.e. recent developments in the Open Access market. OASPA consists of a number of publishers, commercial and non-profit, e.g. BioMed Central, Co-Action Publishing, Copernicus, Hindawi, Journal of Medical Internet Research, Medical Education Online, PLoS, SAGE Publications, SPARC Europe and Utrecht University Library Publishers. The initial activism of OASPA was about dispelling fears about Open Access (Is it peer-reviewed? Is it based on serious research?). Caroline listed factors showing that the broad perception of Open Access has changed over the past few years. The new characterization is that Open Access is about the grand challenges of our time and an important prerequisite for economic growth. The discussion is now about the finer points of how OA fits into academic publishing, rather than whether or not it should exist at all. Caroline noted that beyond gold vs. green road, there is now more talk of mixing and combining the two approaches. She pointed to a huge growth in OA publications over the last 2-3 years and noted that “everybody is getting into the game”, including commercial publishers such as Springer, SAGE and Wiley. So how necessary is an organization like OASPA if OA is so popular? As Caroline put it, “now we can roll up our sleeves and do different things” (e.g. educate legacy publishers and scholarly societies who lack the resources to successfully implement OA). Another area of activity of OASPA is discussing what should count as an open access journal. Free access AND re-use are crucial according to Caroline, who noted that OASPA promotes the use of CC-BY across the board, although there are exceptions to this. It is now about making the point that re-use is interesting, about finding arguments that convince scholars and publishers of the advantages of data mining and aggregation services for which re-use is required. Licensing and technical standards are key in this respect. 
Caroline closed by noting the significance of DOAJ and the development of new payment systems for OA article charges, which would make it easier for authors and publishers to utilize OA.


An interesting issue (especially to the library and information science community) that Google’s Max Senges raised at the Berlin Symposium on Internet and Society (#bsis11) was how the impact of the institute’s research could be measured. HIIG’s mission is not just to produce excellent scholarship, but also to foster a meaningful dialog with a wide range of stakeholders beyond academia in relation to the issues that the institute investigates.

This approach has a number of implications that I want to briefly address. My views are my own, but I consider this an exciting test case for a modern, digital form of science evaluation. I believe three things can serve to make the institute’s research as transparent as possible:

  1. primary research results (i.e. papers) should be Open Access,
  2. journalistic contributions (essays, interviews, public speaking) beyond academic publications should be encouraged,
  3. communication of research via social media (blogs, Twitter) should be encouraged.

Open Access is of key importance

David Drummond emphasized the importance of Open Access in his speech at the Institute’s inauguration. A plausible step to make Open Access part of the institute’s culture could be to sign the Berlin Declaration and set up a dedicated repository of institute publications. HIIG could encourage its researchers to publish in gold road Open Access journals such as those listed in the DOAJ and encourage use of a green road approach as per the SHERPA/RoMEO list in the remaining cases. It could further encourage the use of Creative Commons or similar licences for scholarly publications.

Journalism and engagement with the general public

The public has a considerable interest in the issues investigated at HIIG and accordingly talking with and through traditional media channels will be of great importance. This should not merely be considered a form of marketing, but rather a form of dialog that will allow HIIG to fulfill its obligation to the public to act as an informed voice in civic debate around issues such as privacy and net neutrality. Engagement with the public via essays, interviews, public speaking and similar activities should be considered part of the institute members’ impact.

Social media’s role for science communication

The institute could consider social media as a central avenue of engaging with a wider public and recognize the willingness to use it accordingly. Scholarly blogging, for example, should be considered as part of a member’s research output instead of being regarded as a chiefly private enterprise. Social media activity cannot supplant traditional scholarly publishing, but it can serve to conduct conversations around research, get the attention of non-academics, and point to formal publications, among other things.

So how could this be implemented? The first and second points (making primary research results available and promoting journalistic contributions) are already standard practice elsewhere. The third is a little trickier. Should it be important how many friends a researcher has on Facebook, or followers on Twitter (assuming he/she is even on these platforms)? Such an approach would be much too simplistic, but perhaps something a little more nuanced could be tried. How about encouraging the use of the #hiig (hash)tag wherever possible and continuously tracking the results? The institute could run its own blog (this may or may not work well, given that many contributors might already have their own) or a blog planet, a site that just aggregates material from existing blogs that is #hiig-tagged.
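
To make the blog planet idea a bit more concrete, here is a minimal sketch in Python. It assumes that posts have already been fetched from contributors' feeds into simple dictionaries; the field names (`author`, `date`, `url`, `tags`, `text`), the function names and the sample posts are all hypothetical and serve only as an illustration, not as a description of anything HIIG actually runs.

```python
import re
from collections import Counter

# Matches inline hashtags such as "#hiig" inside the post text.
HASHTAG = re.compile(r"#(\w+)")


def is_tagged(post, tag="hiig"):
    """True if the post carries the tag as a category or as an inline hashtag."""
    tag = tag.lower()
    categories = {t.lower().lstrip("#") for t in post.get("tags", [])}
    inline = {m.lower() for m in HASHTAG.findall(post.get("text", ""))}
    return tag in categories or tag in inline


def aggregate(posts, tag="hiig"):
    """Return the tagged posts (newest first) and a count of tagged posts per author."""
    tagged = sorted(
        (p for p in posts if is_tagged(p, tag)),
        key=lambda p: p["date"],  # ISO dates sort correctly as plain strings
        reverse=True,
    )
    counts = Counter(p["author"] for p in tagged)
    return tagged, counts


# Hypothetical sample data standing in for posts pulled from existing blogs.
SAMPLE_POSTS = [
    {"author": "A", "date": "2011-10-25", "url": "https://example.org/a1",
     "tags": ["HIIG"], "text": "Notes from the inauguration."},
    {"author": "B", "date": "2011-10-26", "url": "https://example.org/b1",
     "tags": [], "text": "Thoughts on measuring impact #hiig #bsis11"},
    {"author": "A", "date": "2011-10-27", "url": "https://example.org/a2",
     "tags": ["open access"], "text": "An unrelated post about #bsis11"},
]

if __name__ == "__main__":
    tagged, counts = aggregate(SAMPLE_POSTS)
    for post in tagged:
        print(post["date"], post["author"], post["url"])
    print(dict(counts))
```

The per-author count is deliberately crude. It is meant as a starting point for tracking where tagged conversations happen, not as a score for individual researchers.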

These are just general ideas, but eventually they could coalesce into a framework for evaluating HIIG’s impact beyond purely scholarly (and faulty) forms of measurement such as the impact factor.

I’m currently relaxing at HIIG HQ, watching the staff make final preparations for the Institute’s formal inauguration, which will take place at 5pm today at Humboldt University’s Audimax (do drop by if you’re in the area, even if you haven’t been formally invited). I thought I’d share two statements on the launch of the Institute from Google, which were posted today and yesterday.

“Interaktion von Internet, Forschung und Gesellschaft verstehen” (in German)
David Drummond, VP Google, in German newspaper DIE ZEIT

Launching an Internet & Society Research Institute
Max Senges, Google Policy, Google European Public Policy Blog
