Aug 05

I’ve plotted several word association graphs for this New York Times article (1st paragraph) using R and the igraph library.

#1, random method

text-igraph-random

#2, circle method

text-igraph-circle

#3, sphere method

text-igraph-sphere

#4, spring method

text-igraph-spring

#5, fruchterman-reingold method

text-igraph-fruchterman-reingold

#6, kamada-kawai method

text-igraph-kamada-kawai

#7, graphopt method

text-igraph-graphopt

The red vertices belong to the largest cliques in the graph. Here’s the (rough) R code for plotting such graphs:

rm(list=ls());

library("igraph");
library("Cairo");

# read parameters
print("Text-as-Graph for R 0.1");
print("------------------------------------");

print("Path (no trailing slash): ");
datafolder <- scan(file="", what="char");

print("Text file: ");
datafile <- scan(file="", what="char");

txt <- scan(paste(datafolder, datafile, sep="/"), what="char", sep="\n", encoding="UTF-8");

print("Width/Height (e.g. 1024x768): ");
res <- scan(file="", what="char");
rwidth <- unlist(strsplit(res, "x"))[1]
rheight <- unlist(strsplit(res, "x"))[2]

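words <- unlist(strsplit(gsub("[[:punct:]]", " ", tolower(txt)), "[[:space:]]+"));
# drop empty tokens (a line starting with punctuation or a space yields "")
words <- words[nzchar(words)];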

g.start <- 1;

g.end <- length(words) - 1;

assocs <- matrix(nrow=g.end, ncol=2)

for (i in g.start:g.end)
{
  assocs[i,1] <- words[i];
  assocs[i,2] <- words[i+1];
  print(paste("Pass #", i, " of ", g.end, ". ", "Node word is ", toupper(words[i]), ".", sep=""));
}

print("Build graph from data frame...");
g.assocs <- graph.data.frame(assocs, directed=F);

print("Label vertices...");
V(g.assocs)$label <- V(g.assocs)$name;

print("Associate colors...");
V(g.assocs)$color <- "Gray";

print("Find cliques...");
V(g.assocs)[unlist(largest.cliques(g.assocs))]$color <- "Red";

print("Plotting random graph...");
CairoPNG(paste(datafolder, "/", "text-igraph-random.png", sep=""), width=as.numeric(rwidth), height=as.numeric(rheight));
plot(g.assocs, layout=layout.random, vertex.size=4, vertex.label.dist=0);
dev.off();

print("Plotting circle graph...");
CairoPNG(paste(datafolder, "/", "text-igraph-circle.png", sep=""), width=as.numeric(rwidth), height=as.numeric(rheight));
plot(g.assocs, layout=layout.circle, vertex.size=4, vertex.label.dist=0);
dev.off();

print("Plotting sphere graph...");
CairoPNG(paste(datafolder, "/", "text-igraph-sphere.png", sep=""), width=as.numeric(rwidth), height=as.numeric(rheight));
plot(g.assocs, layout=layout.sphere, vertex.size=4, vertex.label.dist=0);
dev.off();

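print("Plotting spring graph...");
# note: newer igraph releases appear to have dropped the spring layout; there, layout.spring may only fall back to Fruchterman-Reingold with a warning
CairoPNG(paste(datafolder, "/", "text-igraph-spring.png", sep=""), width=as.numeric(rwidth), height=as.numeric(rheight));
plot(g.assocs, layout=layout.spring, vertex.size=4, vertex.label.dist=0);
dev.off();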

print("Plotting fruchterman-reingold graph...");
CairoPNG(paste(datafolder, "/", "text-igraph-fruchterman-reingold.png", sep=""), width=as.numeric(rwidth), height=as.numeric(rheight));
plot(g.assocs, layout=layout.fruchterman.reingold, vertex.size=4, vertex.label.dist=0);
dev.off();

print("Plotting kamada-kawai graph...");
CairoPNG(paste(datafolder, "/", "text-igraph-kamada-kawai.png", sep=""), width=as.numeric(rwidth), height=as.numeric(rheight));
plot(g.assocs, layout=layout.kamada.kawai, vertex.size=4, vertex.label.dist=0);
dev.off();

#CairoPNG(paste(datafolder, "/", "text-igraph-reingold-tilford.png", sep=""), width=as.numeric(rwidth), height=as.numeric(rheight));
#plot(g.assocs, layout=layout.reingold.tilford, vertex.size=4, vertex.label.dist=0);
#dev.off();

print("Plotting graphopt graph...");
CairoPNG(paste(datafolder, "/", "text-igraph-graphopt.png", sep=""), width=as.numeric(rwidth), height=as.numeric(rheight));
plot(g.assocs, layout=layout.graphopt, vertex.size=4, vertex.label.dist=0);
dev.off();

print("Done!");
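
The word-pairing loop in the script can also be written without an explicit loop. Below is a minimal sketch in base R, assuming the same tokenization as above; `bigram_edges` is a hypothetical helper name of my choosing, not an igraph function, and the sample sentence is made up for illustration.

```r
# Hypothetical helper: turn raw text lines into a two-column matrix of
# adjacent word pairs (the same pairs the loop in the script produces).
bigram_edges <- function(txt) {
  # Lower-case, replace punctuation with spaces, split on whitespace.
  words <- unlist(strsplit(gsub("[[:punct:]]", " ", tolower(txt)), "[[:space:]]+"))
  # Drop empty tokens left behind by leading punctuation or spaces.
  words <- words[nzchar(words)]
  # Fewer than two words means there are no pairs to build.
  if (length(words) < 2) {
    return(matrix(character(0), ncol = 2))
  }
  # Pair each word with its successor.
  cbind(head(words, -1), tail(words, -1))
}

# Made-up sample input, only to show the shape of the result.
edges <- bigram_edges("The cat sat. The cat ran!")
print(edges)
```

The resulting matrix has the same layout as `assocs`, so it should be usable in its place in the `graph.data.frame(...)` call. Repeated pairs stay in the matrix as repeated rows, just as they do in the loop version.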

May 26

I thought I’d post an updated version of the simple stats on Twitter activity presented here. The data in the older post was collected before THATCamp took place; the graphs below show the activity during and after the camp.

The tweets I’ve collected are also available here (my own file) and on TwapperKeeper.

Tweets over time (roughly May 14th to 24th)

Most active users

Most @-messaged users

Most retweeted users
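
The script behind these charts isn't shown here, so the following is only a sketch of how such counts could be computed in R. It assumes the tweets sit in a data frame with a `user` and a `text` column (both column names and all sample rows are hypothetical), and it treats a tweet as a retweet only if it starts with "RT @name".

```r
# Hypothetical sample data; the real input would be the collected tweets.
tweets <- data.frame(
  user = c("alice", "bob", "alice", "carol"),
  text = c("@bob see you at #thatcamp",
           "RT @alice: great session #thatcamp",
           "@bob @carol slides are online",
           "RT @alice: great session #thatcamp"),
  stringsAsFactors = FALSE
)

# Most active users: number of tweets per author.
active <- sort(table(tweets$user), decreasing = TRUE)

# Most retweeted users: the account named right after a leading "RT".
is_rt <- grepl("^RT @\\w+", tweets$text)
rt_users <- sub("^RT @(\\w+).*$", "\\1", tweets$text[is_rt])
retweeted <- sort(table(rt_users), decreasing = TRUE)

# Most @-messaged users: every @name in tweets that are not retweets.
plain <- tweets$text[!is_rt]
mentions <- unlist(regmatches(plain, gregexpr("@\\w+", plain)))
messaged <- sort(table(sub("^@", "", mentions)), decreasing = TRUE)

print(active)
print(retweeted)
print(messaged)
```

Each of the three tables can be handed to `barplot()` to get a chart per category.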

May 24

I’ve data-mined the #thatcamp hashtag a bit more and extracted all 230 links that were tweeted recently (including some from THATCamp Paris). Enjoy :-)

(or go here to view the table inside Google Docs)
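
For anyone who wants to try something similar, here is a rough sketch of pulling links out of tweet texts in R. It is not the exact procedure used for the table: the sample texts and the variable names are hypothetical, and the URL pattern is a deliberately simple assumption (anything starting with http:// or https:// up to the next whitespace).

```r
# Hypothetical tweet texts; the real input would be the collected tweets.
texts <- c("Notes from today: http://example.org/notes #thatcamp",
           "RT: http://example.org/notes and https://example.com/slides.",
           "No link here #thatcamp")

# Pull out everything that looks like an http(s) URL.
links <- unlist(regmatches(texts, gregexpr("https?://[^[:space:]]+", texts)))

# Trim trailing punctuation that usually belongs to the sentence, not the URL.
links <- sub("[.,;:!?)]+$", "", links)

# One row per distinct link, most tweeted first.
link_counts <- sort(table(links), decreasing = TRUE)
link_table <- data.frame(url = names(link_counts),
                         tweets = as.integer(link_counts),
                         stringsAsFactors = FALSE)
print(link_table)
```

This only counts identical strings, so two different shortened URLs pointing at the same page would show up as separate rows.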

Jan 27

Here’s a list of URLs and hashtags that were popular among the @scientwists community last week. I realize that this is just a long enumeration, but I’m planning to publish these stats in a more concise format in the near future.

January 18th
http://phylogenomics.blogspot.com/2010/01/top-11-things-i-learned-at-science.html
http://deepseanews.com/2010/01/miriam-joins-us-at-dsn/
http://www.guardian.co.uk/science/2010/jan/18/running-brain-memory-cell-growth
#scio10
#Biotechnology
#hcsm

January 19th
http://trueslant.com/ryansager/2010/01/18/science-reporting-gone-wild/
http://timesonline.typepad.com/science/2010/01/science-on-the-bbc.html
http://friendfeed.com/brembs/177a01db/bertrand-russell-on-god-1959
#scio10
#Biotechnology
#ten23

January 20th
http://www.shortyawards.com/
http://www.ustream.tv/channel/nada-importa
http://friendfeed.com/jcbradley/0a46ac22/science-online-2010-thoughts
#scio10
#health
#technology

January 21st
http://www.popsci.com/science/article/2010-01/five-reasons-henrietta-lacks-most-important-woman-medical-history
http://phylogenomics.blogspot.com/2010/01/enough-w-good-here-are-top10-problems-w.html
http://www.newscientist.com/article/dn18423-viruses-use-hive-intelligence-to-focus-their-attack.html
#scio10
#technology
#ten23

January 22nd
http://fc07.deviantart.net/fs19/f/2007/248/a/f/dna_strand_corset_32_piercings_by_mizuzinkaholik.jpg
http://friendfeed.com/danielmietchen/cbfc448b/collaborative-futures-3-mike-linksvayer
http://scienceblogs.com/bookoftrogool/2010/01/scientists_why_your_access_to.php
#scio10
#corporateeyesontheprize
#technology

January 23rd
http://www.badscience.net/2010/01/12-monkeys-no-8-wait-sorry-i-meant-14/
http://www.ustream.tv/channel/aw8
http://friendfeed.com/pansapiens/212fde9c/you-know-your-research-is-original-when
#scio10
#3wordsconservativeshate
#FF

January 24th
http://www.shortyawards.com/
http://featuresblogs.chicagotribune.com/printers-row/2010/01/eureka-great-discoveries-in-new-science-books.html
http://friendfeed.com/science-2-0/3124a7c3/looking-for-help-on-building-list-of-social-web
#3wordsconservativeshate
#retailpolitics
#scio10

January 25th
http://iambiotech.org/2010/01/25/biotech-roundup-monday-january-25th/?utm_source=hootsuite&utm_medium=tweet&utm_content=roundup&utm_campaign=hootsuite
http://friendfeed.com/mfenner/04c40a1a/scientists-and-librarians-friend-or-foe
http://blogs.telegraph.co.uk/technology/markchangizi/100004573/do-ant-colonies-have-something-in-common-with-the-human-body/
#scio10
#hcsm
#science
