I’ve already shared this bit of personal news with a few friends and colleagues, but I thought I’d blog about it as well — especially since I’m woefully behind on my Iron Blogger schedule.
After a fairly long time in the making, I have been awarded a three-year research grant from the Deutsche Forschungsgemeinschaft (DFG) for the project Networking, visibility, information: a study of digital genres of scholarly communication and the motives of their users (summary in German on the DFG’s site). The project investigates new forms of scholarly communication (especially blogging and Twitter) and their role in academia. My key concerns are usage motives, i.e. why scholars use blogs and Twitter, and how these motives correspond with usage practices (how they blog and tweet), rather than how many researchers use these channels of communication or what makes them refrain from using them (see this blog post and the study mentioned in it for that kind of work). My main methods will be qualitative interviews with a sample of 20-25 blogging and/or tweeting academics, along with in-depth content analysis of the material they post in these channels over a prolonged period (>1 year). Identifying usage patterns and relating them to the participants’ narratives about their use will be another key objective. Ultimately, I hope to find a (tentative) answer to the question of what role blogs and Twitter may play in the future of digital scholarship, and whether they will remain a niche phenomenon or become mainstream over time.
The project follows up on my work on corporate blogging and connects strongly to what we have been doing at the Junior Researchers Group “Science and the Internet” over the past year, but the focus on interviews should result in a more user-centric analysis. As someone who has been doing (applied) linguistic analysis to make inferences about social processes, I feel much more comfortable actually talking to the people I want to study, rather than just crunching numbers on how they tweet. Big data social science research is obviously and understandably en vogue these days, but I hope to find a good synergy between qualitative and quantitative approaches in my project.
My new institutional home for the next three years will be the Berlin School of Library and Information Science at Humboldt University. I’m grateful to Michael Seadle for supporting my project and really look forward to working with my new colleagues at IBI (that’s the German acronym, which, as far as I can tell, is preferred to its more entertaining English equivalent). I also look forward to working with colleagues from the Alexander von Humboldt Institute for Internet and Society (HIIG) where I’m currently supporting the project Regulation Watch. Finally, I plan to keep in close contact with the colleagues in Düsseldorf, both at the Junior Researchers Group and the Department of English Language and Linguistics, where I have learned virtually everything I know about being a researcher. I am especially indebted to Dieter Stein for his enduring support and for his contagious enthusiasm for all aspects of scholarship.
Sic itur ad astra!
For an overview of previous work I’ve done in this direction, have a look at my publications.
Those of you following my occasional updates here know that I have previously posted code for graphing Twitter friend/follower networks using R (post #1, post #2). Kai Heinrich was kind enough to send me some updated code for doing so using a newer version of the extremely useful twitteR package. His very crisp, yet thoroughly documented script is pasted below.
# Script for graphing Twitter friends/followers
# by Kai Heinrich (kai.heinrich@mailbox.tu-dresden.de)
# load the required packages
library("twitteR")
library("igraph")
# HINT: In order for the tkplot() function to work on mac you need to install
# the TCL/TK build for X11
# (get it here: http://cran.us.r-project.org/bin/macosx/tools/)
#
# Get user information with the twitteR function getUser();
# instead of your own name you can use any other username as well
start<-getUser("YOUR_USERNAME")
# Get friend and follower names by first fetching the IDs (getFriendIDs(), getFollowerIDs())
# and then looking up the names (lookupUsers())
friends.object<-lookupUsers(start$getFriendIDs())
follower.object<-lookupUsers(start$getFollowerIDs())
# Retrieve the names of your friends and followers from the friend
# and follower objects. You can limit the number of friends and followers by adjusting the
# size of the selected data with [1:n], where n is the number of followers/friends
# that you want to visualize. If you do not put in the expression the maximum number of
# friends and/or followers will be visualized.
n<-20
friends <- sapply(friends.object[1:n],name)
followers <- sapply(follower.object[1:n],name)
# Create a data frame that relates friends and followers to you for expression in the graph
relations <- merge(data.frame(User='YOUR_NAME', Follower=friends),
data.frame(User=followers, Follower='YOUR_NAME'), all=T)
# Create graph from relations.
g <- graph.data.frame(relations, directed = T)
# Assign labels to the graph (=people's names)
V(g)$label <- V(g)$name
# Plot the graph using plot() or tkplot(). Remember the HINT at the
# beginning if you are using MAC OS/X
tkplot(g)
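If you want to see what the relations data frame in Kai's script looks like before going anywhere near the Twitter API, here is a minimal sketch in base R. The account names are made up for illustration (they stand in for what lookupUsers() would return), and the variable `mutual` is my own addition, not part of the original script. Each row is one directed "follows" edge: you point at your friends, and your followers point at you.

```r
# Hypothetical names standing in for the results of lookupUsers()
me        <- "YOUR_NAME"
friends   <- c("alice", "bob")    # accounts you follow
followers <- c("bob", "carol")    # accounts that follow you

# Same construction as in the script above: merging on both columns
# with all = TRUE yields the union of the two edge lists
relations <- merge(
  data.frame(User = me, Follower = friends),
  data.frame(User = followers, Follower = me),
  all = TRUE
)
print(relations)   # 4 rows: 2 outgoing edges, 2 incoming edges

# Accounts that appear in both lists are mutual connections
mutual <- intersect(friends, followers)
print(mutual)
```

A data frame like this is what gets passed to graph.data.frame() in the script, with the first column read as the source and the second as the target of each edge.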
Unfortunately I’m not able to attend the annual IPrA conference next week in Manchester and had to cancel the trip on short notice. I was scheduled to give a talk as part of the session Quoting in Computer-mediated Communication on my work with Katrin Weller on retweeting among scientists.
Luckily for me, there will be a follow-up event of sorts (see below). I’ve posted the call here since it doesn’t seem to be available on the Web other than as a PDF. Submit something if you’re doing research on quoting! I’m fairly sure that the deadline will be extended by a week or two.
CfP: Quoting Now and Then – 3rd International Conference on Quotation and Meaning (ICQM)
University of Augsburg, Germany
19 April – 21 April 2012
Conference Convenors:
Wolfram Bublitz
Jenny Arendholz
Christian Hoffmann
Monika Kirner
Contact: Monika Kirner
E-mail: monika.kirner@phil.uni-augsburg.de
Call for Papers
This conference addresses the pragmatics of quoting as a metacommunicative act both in old (printed) and new (electronically mediated) communication. With the rapid evolution of new media in the last two decades, approaches to the study of (forms, functions and impact of) quoting have been gaining momentum in linguistics. Although quotations in print media have already been investigated to some extent, quoting in computer-mediated communication is still uncharted territory. This conference shall focus on the formal and functional evolution of quoting from old (analog) to new (digital) media. While the conference builds on the panel “Quoting in Computer-mediated Communication” to be presented in July 2011 at the International Conference of Pragmatics (IPrA), it assumes a much broader perspective, paying special tribute to the inherent confluence and complementarity of synchronic and diachronic approaches. Consequently, we invite papers from both (synchronic and diachronic) perspectives to report on the formal, functional as well as the pragmatic-discursive and multimodal nature of quoting in different genres or media.
Plenary talk: Jörg Meibauer
Abstracts:
Please submit an abstract of not more than 500 words (for a 30 min talk plus 10 min discussion) via e-mail to monika.kirner@phil.uni-augsburg.de
Deadline for abstracts:
15 August 2011 (extended from 1 July 2011)
Dear Twitter user,
I am a linguist at the University of Düsseldorf, and my research focuses on Internet communication. As part of the study “Aspekte privater Twitter-Kommunikation” (Aspects of Private Twitter Communication), I would like to examine the usage habits of German-speaking Twitter users who do not use Twitter exclusively for professional purposes (in contrast to, e.g., journalists, academics, politicians, and other people in communication professions). To this end, I would like to record and analyze your public tweets for one month. Afterwards, I would like to ask you a few questions (no more than 10) about your Twitter use by email.
Only public tweets (i.e. no DMs) will be recorded. All data will be anonymized (i.e. names, including Twitter nicknames, will be removed) and will not be passed on to third parties. Individual tweets can be excluded from the recording at any time by using the hashtag #exclude. At the end of the study period, I will gladly send you an archive of your recorded tweets if you are interested.
Besides your contribution to scientific research, there is also a (small) reward: at the end of the study period, I will raffle off an Amazon voucher worth 50 euros among the participants.
If you are willing to take part, please send a short email to Cornelius.Puschmann@uni-duesseldorf.de (Edit: of course you can also get in touch via Twitter). If you do not want to participate, you don’t need to do anything further. I’m happy to answer questions about the study by email.
Thank you in advance for your interest and your support!
Dr. Cornelius Puschmann
Junior Researchers Group “Science and the Internet”
Heinrich-Heine-Universität Düsseldorf
As part of the research we’re doing in Düsseldorf on the use of Twitter at academic conferences, here’s a poster we’re presenting in a few days at GOR ’11:
Here’s the citation for the poster:
Puschmann, C., Weller, K., & Dröge, E. (2011). Studying Twitter conversations as (dynamic) graphs: visualization and structural comparison. Presented at General Online Research, 14-16 March 2011, Düsseldorf, Germany. Retrieved from http://ynada.com/posters/gor11.pdf.
See this older post for more information on how to visualize dynamic graphs of retweets with Gephi.
I thought I’d write a brief update to this earlier post discussing the consequences of what has recently happened with Twitter’s TOS update/enforcement of the redistribution clause. Here is a concise summary from ReadWriteWeb:
[..] Twitter’s recent announcement that it was no longer granting whitelisting requests and that it would no longer allow redistribution of content will have huge consequences on scholars’ ability to conduct their research, as they will no longer have the ability to collect or export datasets for analysis.
Read this earlier RWW post for more background. Twitter has cracked down on services like TwapperKeeper and 140kit.com that allow users not only to track Twitter keywords and hashtags, but also to export and download archives of tweets in XML or CSV. Apparently Twitter wants to stop redistribution of “its” content to the extent possible, including redistribution for research purposes. From the RWW post:
140kit offered its Twitter datasets to other scholars for their own research. By no means a full or complete scraping of Twitter data, this information that the project had collected was still made available for download (for free) to researchers. But no longer.
The people at 140kit, to their credit, are working on an approach which would allow researchers to work with Twitter data without exporting data, but rather by using their interface. From 140kit’s website:
We have a solution, which will involve using a plugin based analytical approach, which will not allow you to export data, but will, with Twitter’s blessings, allow you to ask any questions to your dataset with ease.
Hmm, sorry, but I’m underwhelmed. There are already countless services out there that allow Twitter analysis in some form, often with nebulous results, because data collection and methods are not transparent. With any list of frequent terms on Twitter, the questions need to be: What stop words did you exclude? How clean is your data? I can’t know whether these things are done appropriately for my analysis unless I do them myself. You might object that not everyone is keen on sifting through CSV files with their own scripts. That’s true outside of academic research, where a GUI tool might be okay for a casual analysis, but for serious analysis direct access to the raw data is a must. And beyond just having access yourself, in the spirit of reproducible research it’s important to distribute the dataset along with your paper. That’s where we should be heading, rather than basing our analyses on pre-produced tools and mechanisms which handle the data in ways that are opaque and beyond our control.
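To make the stop-word point concrete, here is a minimal sketch in base R. The sample tweets, the stop-word list and the helper `term_freq()` are all made up for illustration; real data would come from an exported archive, and a real stop list would be much longer. The only point is that the "top terms" you get depend directly on cleaning decisions that a closed tool may never show you.

```r
# Hypothetical sample tweets; in practice these would be read from a CSV export
tweets <- c(
  "Great talk on #openaccess at the conference",
  "The slides of the #openaccess talk are online",
  "RT @someone: the conference is great"
)

# term_freq() is an illustrative helper, not part of any package
term_freq <- function(texts, stop_words = character(0)) {
  # lower-case, then split on anything that is not a letter, digit, # or @
  tokens <- unlist(strsplit(tolower(texts), "[^a-z0-9#@]+"))
  # drop empty strings and anything on the stop list
  tokens <- tokens[nchar(tokens) > 0 & !(tokens %in% stop_words)]
  sort(table(tokens), decreasing = TRUE)
}

# Same data, two different cleaning decisions
raw_freq   <- term_freq(tweets)
clean_freq <- term_freq(tweets,
                        stop_words = c("the", "on", "at", "of", "are", "is", "rt"))

print(head(raw_freq, 3))
print(head(clean_freq, 3))
```

Without a stop list the most frequent term is simply "the"; with one, the content words and the hashtag move to the top. Publishing the script and the stop list alongside the dataset is what lets someone else check that step.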
Will this shut off researchers’ access to Twitter data, as the RWW article claims? Not really, at least not everyone’s access. Those researchers who build their own tools (or deploy existing ones, such as yourTwapperKeeper, on their own servers) will have no trouble at all getting all the data they want. It’s just the rest, those who can’t code or lack tech support (=funding), who will be restricted to simple GUI tools. If you’re a PhD student at a small university, in a department with no technical expertise or support, you have a competitive disadvantage. More power to computer scientists, and to centers like Berkman and the OII, this decision seems to say.
How to solve this problem? Luckily, services like Amazon AWS level the playing field somewhat. Setting up an account there to scrape Twitter on a regular basis (for example with yourTwapperKeeper, or with your own set of scripts) is probably the best alternative to using a service like 140kit.
Note: Check out this video interview with John O’Brien of TwapperKeeper, who gives basically the same advice.
Update: I’ve written a follow-up to this post.
A few days ago, the people behind Twitter archival site TwapperKeeper.com announced that they will be discontinuing the export feature of the service on March 20, 2011. Apparently the feature is in violation of Twitter’s terms of service, at least in the form it’s currently implemented in TwapperKeeper.
Unfortunately, this cuts off a number of academics who are investigating communication on Twitter for scientific purposes from a convenient data source. While it’s fairly easy to get data directly via the Twitter API (which is what TwapperKeeper was doing), I know many people who want to concentrate on the data itself, rather than running their own servers to scrape Twitter on a regular basis. What’s more, Twitter’s attitude is worrisome: many of us have tried to get an exemption from API rate limits in the past, to no avail. Twitter doesn’t give researchers privileged access to their data, and now they’re crippling TwapperKeeper on top of that.
Bottom line: what will we use after March 20? Ideally, a replacement would provide the following:
- the hashtag/search query functionality of TwapperKeeper,
- the export functionality of TwapperKeeper,
- exclusive use for academic purposes (on the grounds that this might keep Twitter from shutting it down),
- stability and reliability,
- long-term viability.
The last point is important, because I don’t think it will be difficult to set up a server somewhere to suit the needs of a few people, but a larger-scale solution seems more sensible in the long run. Maybe JISC can do something like that, based on yourTwapperKeeper (which they supported)? Or one of the big institutes (OII, Berkman)? Either way it would be nice to find an alternative that doesn’t give those of us with devs and major IT support behind them a huge edge over the rest…

