[R] Calculating Betweenness - Efficiency problem

Senthil Purushothaman spurushothaman at lnxresearch.com
Tue Jul 22 20:58:37 CEST 2008


Dear Gabor,
   Thank you very much for the insights. I have been using the igraph
package for my computations, but I did not know about
graph.data.frame(), so thanks again for that. I ran my data through
the steps you provided. Oddly, although the .csv file has
approximately 300,000 records (remember that the file gets truncated to
65536 rows when opened in Excel 2003), not all of them seem to be read
in during the operation: the final betweenness list contains only
about 1,000 records, when it should contain tens of thousands.
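
One way to narrow down a mismatch like this is to compare three counts: the data lines in the file, the rows read.csv() actually parsed, and the number of distinct cities. betweenness() returns one score per vertex, so the length of the result follows the number of distinct cities rather than the number of rows. The sketch below uses base R only; the temporary file and its contents are a made-up stand-in for Test.csv, and the column names City1 and City2 are assumed from the description in this thread.

```r
# Hypothetical stand-in for Test.csv: a header plus five City1,City2 links.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("City1,City2",
             "Paris,Lyon",
             "Paris,Nice",
             "Lyon,Nice",
             "Nice,Rome",
             "Rome,Paris"), csv_path)

tab <- read.csv(csv_path, stringsAsFactors = FALSE)

# Data lines physically present in the file (header excluded).
n_lines <- length(readLines(csv_path)) - 1
# Rows that read.csv() actually parsed.
n_rows <- nrow(tab)
# Distinct cities across both columns, i.e. the number of vertices.
n_cities <- length(unique(c(tab$City1, tab$City2)))

# Fields per line with quote handling switched off; any line that does
# not split into exactly two fields is worth inspecting by hand.
n_fields <- count.fields(csv_path, sep = ",", quote = "")
bad_lines <- which(n_fields != 2)

cat("lines in file:", n_lines,
    " rows read:", n_rows,
    " distinct cities:", n_cities, "\n")
```

If n_rows is smaller than n_lines, the file is probably being parsed differently than expected (stray quote characters are one possible cause, though that is only a guess without seeing the data). If the two agree, the short betweenness list may simply reflect the number of distinct city names.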

I know that you are a busy person. This problem seems to be a very
different challenge. I am attaching the Test.csv file for your
experiments. Thank you very much again.

Best regards,
Senthil
(909) 267-0799

-----Original Message-----
From: Gabor Csardi [mailto:csardi at rmki.kfki.hu] 
Sent: Monday, July 21, 2008 1:57 AM
To: Senthil Purushothaman
Cc: jim holtman; r-help at r-project.org
Subject: Re: [R] Calculating Betweenness - Efficiency problem

Senthil,

you can try the 'igraph' package. Export your two-column Excel file
as a .csv, use 'read.csv' to read that into R, then 'graph.data.frame'
to create an igraph graph from it. Finally, call 'betweenness' on 
the graph. It is really just three/four lines, something like this:

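library(igraph)
tab <- read.csv(...)        # two columns, one link (City1, City2) per row
g <- graph.data.frame(tab)  # one edge per row; directed by default
bet <- betweenness(g)       # one score per vertex (city)
bet <- data.frame(city=V(g)$name, betweenness=bet)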

The last line creates a two column data frame with the betweenness 
score of each city. 

Best,
Gabor

On Sat, Jul 19, 2008 at 02:59:07PM -0700, Senthil Purushothaman wrote:
> Hi Jim,
>     Thank you for the response. Your suggestion will help me avoid the
> whole text to number conversion process that I perform using LookUp in
> Excel. I will definitely give it a shot. But it still doesn't address
> the vector conversion since a graph file is drawn only using the
> vectors. Assuming that I use 'factor' to convert the characters to
> numbers, how do I convert these numbers into vectors?
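
As a rough illustration of the conversion being asked about here, the sketch below maps city names to integer ids with 'factor' and interleaves them into a single vector of the c(1,3,1,5,...) form mentioned elsewhere in this thread. The link table is made up for the example, and the column names City1 and City2 are assumed from the original description. It uses base R only.

```r
# Hypothetical link table; in practice this would come from read.csv().
tab <- data.frame(City1 = c("Paris", "Paris", "Lyon", "Lyon"),
                  City2 = c("Nice", "Rome", "Oslo", "Rome"),
                  stringsAsFactors = FALSE)

# Use one shared set of levels so that a city gets the same number
# whether it appears in the first or the second column.
cities <- sort(unique(c(tab$City1, tab$City2)))
id1 <- as.integer(factor(tab$City1, levels = cities))
id2 <- as.integer(factor(tab$City2, levels = cities))

# Interleave the ids row by row: from1, to1, from2, to2, ...
# rbind() stacks the two id vectors as rows, and as.vector() reads the
# resulting matrix column by column, which gives the pairing we want.
edge_vec <- as.vector(rbind(id1, id2))
print(edge_vec)
# 4 2 4 5 1 3 1 5
```

The original name of any id can be recovered with cities[id]. Converting each column with its own separate call to factor() would number the two columns independently, which is why the shared levels matter.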
> 
> Thanks,
> Senthil
> 
> 
> 
> 
> -----Original Message-----
> From: jim holtman [mailto:jholtman at gmail.com]
> Sent: Sat 7/19/2008 4:49 AM
> To: Senthil Purushothaman
> Cc: r-help at r-project.org
> Subject: Re: [R] Calculating Betweenness - Efficiency problem
>  
> It would seem that you can output the initial file from EXCEL, read it
> into R with 'read.csv' and then use 'factor' to convert the characters
> for City1 and City2 to the numbers that you want to use.  Have you
> tried this approach?
> 
> On Fri, Jul 18, 2008 at 3:51 PM, Senthil Purushothaman
> <spurushothaman at lnxresearch.com> wrote:
> > Hello,
> >
> > I am calculating 'Betweenness' of a large network using R.
> > Currently, I have the node-node information (City1-City2) in an Excel
> > file, present in two columns where column A has City1 and column B has
> > the City2 that City1 is connected to. These are the steps that I go
> > through to calculate betweenness of my network.
> >
> > a) Convert the City1-City2 (text) into Number1-Number2 in the Excel
> > file where every unique city has a unique number.
> > b) Paste all the city-city information separated by comma into
> > c(...) in the R GUI to obtain the corresponding vectors. As you can
> > imagine this copy-paste operation takes a long time. Example:
> > c(1,3,1,5,2,4,2,5). Just fyi, I have a text file that contains all nodes
> > separated by comma based on the appropriate link information.
> > c) Then, I create a graph file with the above vector.
> > d) I use the graph file to calculate betweenness of my network.
> >
> > I am sure there must be a better, more efficient way to calculate
> > betweenness. Ideally, I would like to just have the City1 - City2 (link)
> > information in two columns in an Excel file and calculate the
> > betweenness from that file directly.
> >
> > Please provide an optimal solution for this problem. I appreciate
> > your time and help.
> >
> > Thanks,
> > Senthil
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem you are trying to solve?
> 
> 

-- 
Csardi Gabor <csardi at rmki.kfki.hu>    UNIL DGM

