Edit: I’ve posted an updated version of the script here. It is not quite as compressed as Anatol’s version, but I think it’s a decent compromise between readability and efficiency. :-)

Edit #2: And yet another update, this one contributed by Kai Heinrich.

I hacked together some code for R last night to visualize a Twitter graph (=who you are following and who is following you) that I briefly showed at the session on visualizing text today at THATCamp and that I wanted to share. My comments in the code are very basic and there is much to improve, but in the spirit of “release early, release often”, I think it’s better to get it out there right away.

Ingredients: R, plus the twitteR and igraph packages (both are loaded at the top of the code below).

Note that packages are most easily installed with the install.packages() function inside of R, so R is really the only thing you need to download initially.
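A minimal sketch of that installation step, assuming the two packages loaded in the code below (twitteR and igraph) are the ones you need. The helper name `find_missing` is made up for this illustration, and the install call is guarded so it only runs in an interactive session.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed yet.
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Packages loaded by the script in this post.
needed <- c("twitteR", "igraph")
to_install <- find_missing(needed)

# Only install when running interactively and something is actually missing.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking first avoids re-downloading packages you already have.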

Code:

# Load twitteR package
library(twitteR)

# Load igraph package
library(igraph)


# Set up friends and followers as empty character vectors. This, along with some of the code below, is not strictly necessary; it is a result of my relative inability to deal with the twitter user object in an elegant way. I'm hopeful that I will figure out a way of shortening this in the future.

friends <- character(0)
followers <- character(0)

# Start a Twitter session. Note that the user through whom the session is started doesn't have to be the one that you search for in the next step. I'm using myself (coffee001) in the code below, but you could authenticate with your username and then search for somebody else.

sess <- initSession('coffee001', 'mypassword')

# Retrieve a maximum of 500 friends for user 'coffee001'.

friends.object <- userFriends('coffee001', n=500, sess)

# Retrieve a maximum of 500 followers for 'coffee001'. Note that retrieving many/all of your followers will create a very busy graph, so if you are experimenting it's better to start with a small number of people (I used 25 for the graph below).

followers.object <- userFollowers('coffee001', n=500, sess)

# This code is necessary at the moment, but only because I don't know how to slice just the "name" field for friends and followers from the list of user objects that twitteR retrieves. I am 100% sure there is an alternative to looping over the objects, I just haven't found it yet. Let me know if you do...

# seq_along() is used instead of 1:length() so the loops are skipped when a list is empty.
for (i in seq_along(friends.object))
{
  friends <- c(friends, friends.object[[i]]@name)
}

for (i in seq_along(followers.object))
{
  followers <- c(followers, followers.object[[i]]@name)
}


# Create data frames that relate friends and followers to the user you search for and merge them.

relations.1 <- data.frame(User='Cornelius', Follower=friends)
relations.2 <- data.frame(User=followers, Follower='Cornelius')
relations <- merge(relations.1, relations.2, all=T)

# Create graph from relations.

g <- graph.data.frame(relations, directed = T)

# Assign labels to the graph (=people's names)

V(g)$label <- V(g)$name

# Plot the graph.

plot(g)

For the screenshot below I've used the tkplot() method instead of plot(), which allows you to move around and highlight elements interactively with the mouse after plotting them. The graph only shows 20 people in order to keep the complexity manageable.


10 Responses to Code and brief instruction for graphing Twitter with R

  1. Dan says:

    Ah! That’s one damn ugly syntax!

  2. A.S. says:

    “…I don’t know how to slice just the “name” field for friends and followers from the list of user objects that twitteR retrieves…”

    Piece of cake:

    sapply(friends.object,name)

    This should return a vector with just the names.
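The same idea can be tried without a Twitter session. The sketch below is only an illustration: `MockUser` is a made-up S4 class standing in for twitteR's user objects, assumed (as in the post's `@name` access) to carry a `name` slot. It uses `vapply()`, a stricter relative of `sapply()` that always returns a character vector, even for an empty list.

```r
library(methods)

# Hypothetical stand-in for a twitteR user object; only the `name` slot is modelled.
setClass("MockUser", slots = c(name = "character"))

users <- list(
  new("MockUser", name = "Alice"),
  new("MockUser", name = "Bob")
)

# Pull the `name` slot out of every object without an explicit loop.
user_names <- vapply(users, function(u) u@name, character(1))

# An empty list yields an empty character vector rather than an error.
no_names <- vapply(list(), function(u) u@name, character(1))
```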

  3. cornelius says:

    Thanks, Anatol! I haven’t worked a whole lot with objects yet, since most of what I’ve done so far worked fine without them. I don’t find R terribly well-documented, at least not in the sense of providing a lot of examples (but perhaps I am looking in the wrong places)…

    @Dan: do you mean my code or R in general? My impression of R: easy syntax, fairly terrible vocabulary (what kind of function is ‘V()’?) Very powerful and great once you know it, but learning is complicated by a relative lack of newbie-friendly resources.

  4. A.S. says:

    I think your code is ok, although you can skip a few steps here and there. For example, there is no reason to create the variables relations.1 and relations.2 if all you want to do is merge them. Instead of this:

    relations.1 <- data.frame(User='Cornelius', Follower=friends)
    relations.2 <- data.frame(User=followers, Follower='Cornelius')
    relations <- merge(relations.1, relations.2, all=T)

    you could do this:

    relations <- merge(data.frame(User='Cornelius', Follower=friends), data.frame(User=followers, Follower='Cornelius'), all=T)

    In fact, since you are only going to use the variable "relations" once, you might as well not bother creating it at all and skip straight to the next step, which should then look like this:

    g <- graph.data.frame(merge(data.frame(User='Cornelius', Follower=friends), data.frame(User=followers, Follower='Cornelius'), all=T), directed = T)

    And since you only use the variable "g" once, why not skip this step as well and do this:

    plot(graph.data.frame(merge(data.frame(User='Cornelius', Follower=friends), data.frame(User=followers, Follower='Cornelius'), all=T), directed = T))

    Of course, we already know that you don't need to create the variables "friends" and "followers", since sapply() will do.

    So this would also work:

    plot(graph.data.frame(merge(data.frame(User='Cornelius', Follower=sapply(friends.object,name)), data.frame(User=sapply(followers.object,name), Follower='Cornelius'), all=T), directed = T))

    And of course, there is no real reason to create the variables "friends.object" and "followers.object", since you can use sapply() directly with the user object, so this would also work:

    plot(graph.data.frame(merge(data.frame(User='Cornelius', Follower=sapply(userFriends('coffee001', n=500, sess),name)), data.frame(User=sapply(userFollowers('coffee001', n=500, sess),name), Follower='Cornelius'), all=T), directed = T))

    This is almost your entire script in a more compact form (the only parts of your script that are still missing are the initiation of the session, which should remain separate, and the bit about V(g)$label <- V(g)$name — I'm not familiar with V(), but I'm sure you could also integrate it). Now you could wrap the whole thing into a function, and you're done.

    NOW do you see why R is beautiful?
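A rough sketch of the "wrap it into a function" step, under a few assumptions: the function name `build_relations` and the sample names are invented for illustration, the inputs are plain character vectors rather than twitteR objects, and `rbind()` is swapped in for `merge(..., all=T)` because the two data frames only need to be stacked. The igraph part runs only if that package is installed.

```r
# Hypothetical helper: edge list for one user's friend/follower network.
# Column names follow the post ("User", "Follower").
build_relations <- function(user, friends, followers) {
  rbind(
    data.frame(User = rep(user, length(friends)), Follower = friends,
               stringsAsFactors = FALSE),
    data.frame(User = followers, Follower = rep(user, length(followers)),
               stringsAsFactors = FALSE)
  )
}

# Made-up names, standing in for what the Twitter calls would return.
relations <- build_relations("Cornelius",
                             friends = c("Ann", "Ben"),
                             followers = c("Ben", "Cleo"))

# Build the graph only when igraph is available; plot(g) would then draw it.
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_data_frame(relations, directed = TRUE)
}
```

Using `rep(user, length(...))` keeps the helper working when a user has no friends or no followers, a case where recycling a single name against an empty vector fails.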

  5. Gabor says:

    I think that V() is explained reasonably well:
    http://igraph.sourceforge.net/doc/R/iterators.html
    http://igraph.sourceforge.net/igraphbook/igraphbook-iterators.html

    If you come from math, then V is quite an obvious name for all vertices of a graph, just like E is an obvious name for the edges. V() and E() are coming from the G=(V,E) graph notation.
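To make the G=(V,E) notation concrete, here is a small sketch with an invented three-vertex graph. The base R part derives the vertex set from an edge list by hand. The igraph part, which runs only if the package is installed, shows the corresponding `V()` call.

```r
# E: a toy edge list (made-up vertex names).
edges <- data.frame(from = c("a", "b"), to = c("b", "c"),
                    stringsAsFactors = FALSE)

# V: every vertex that appears at either end of an edge.
vertex_set <- unique(c(edges$from, edges$to))

if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_data_frame(edges, directed = TRUE)
  print(igraph::V(g)$name)   # vertex names stored on the graph
  print(igraph::ecount(g))   # number of edges
}
```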

  6. [...] By the way, it's very easy to make this graph in R. Anyone who wants to replicate the graph just needs to follow the tips on this blog. [...]

  7. [...] an updated version of my script from last month, something I’ve been meaning to do for a while. I thank Anatol Stefanovic and Gábor [...]

  8. [...] terms of mapping twitter networks with R, I found this post for mapping one’s own personal network of connections. For other visualizations, I found a [...]

  9. [...] know that I have previously posted code for graphing your Twitter friend/follower network using R (post #1. post #2). Last week Kai Heinrich was kind enough to send me some updated code for doing so using a [...]

  10. Bhupendrasinh Thakre says:

    Hi,

    Unfortunately, on my system it says “initSession not found”.
    Am I missing something?

    Best,

    BT
