[R] Help with increasing the speed of script

a217 ajn21 at case.edu
Fri Oct 28 04:59:33 CEST 2011


I actually have two questions regarding the same script:

#################################################
# One slot per chromosome file (chr1 .. chr24)
data <- vector('list', 24)
splc <- vector('list', 24)
df.summ <- vector('list', 24)

for (i in seq_along(data))
{
  data[[i]] <- read.table(file=paste('chr',i,'.nonCG.covered.out',sep=''),
                          header=FALSE)
  colnames(data[[i]]) <- c("chr","start","end","tot","methy")

  # Group rows by region (chr, start, end)
  splc[[i]] <- split(data[[i]], paste(data[[i]]$chr, data[[i]]$start,
                                      data[[i]]$end))

  # Summary statistics of 1 - methy for each region
  df.summ[[i]] <- as.data.frame(t(sapply(splc[[i]], function(x)
    summary(1 - x$methy))))

  cat("finished reading",paste('chr',i),date(),"\n")
}
#################################################

1) I'll start with what is perhaps the easier question. The above script
is a portion of R code that I run in batch mode through a pipeline. I've
decided that any process that takes longer than 5 minutes to complete should
include some measure of progress, so that the user doesn't just sit there
wondering whether the code is working.

So I've tried the command: 

R CMD BATCH -q test.R /dev/tty

which gives the same output as the .Rout file would. What I am hoping to
accomplish is to print only the "cat()" results in batch mode instead of the
entire script.
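
One possible approach to question 1, offered as a sketch rather than a definitive answer: R CMD BATCH echoes every command it runs, whereas Rscript does not echo the script, so running the file with Rscript would leave only the cat() output on the terminal. The helper names progress_line and report_progress below are hypothetical and not part of the original script; the flush() call is there so that each line appears immediately when output is buffered.

```r
# Sketch only: progress_line() and report_progress() are hypothetical helpers.
# Run the script with:  Rscript test.R
# Rscript does not echo commands, so only the cat() output is shown.

# Build one progress message for chromosome i out of n.
progress_line <- function(i, n) {
  sprintf("finished chr%d (%d of %d) at %s",
          i, i, n, format(Sys.time(), "%H:%M:%S"))
}

# Print the message and flush so it is visible right away.
report_progress <- function(i, n) {
  cat(progress_line(i, n), "\n", sep = "")
  flush(stdout())
}

# Stand-in for the real loop: the per-chromosome work would go
# where the comment is.
n_chr <- 3
for (i in seq_len(n_chr)) {
  # ... read and summarise chromosome i here ...
  report_progress(i, n_chr)
}
```

If the regular output needs to go to a file while progress stays on screen, writing the progress lines with cat(..., file = stderr()) and redirecting only standard output is another option.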

2) The second issue I have is with the speed of the script. Specifically,
the slow point of the script is:

splc[[i]]<-split(data[[i]], paste(data[[i]]$chr, data[[i]]$start,
data[[i]]$end))

It may run in quadratic or even exponential time (I haven't run specific
tests): as the input data increases in size, the run time grows much faster
than I would expect from a linear-time script.

Perhaps the more experienced programmers among you could give me a few
hints/suggestions so that I can head in the right direction.
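
A possible direction for question 2, as a sketch under stated assumptions: split() on a whole data frame builds one small data frame per group, which tends to be slow when there are many groups. Since the summary only uses the methy column, splitting just that vector by the same key should give the same result with much less overhead. The data below is synthetic and made up purely for illustration; only the column names come from the script above.

```r
# Synthetic stand-in for one chromosome file (values are made up).
set.seed(1)
starts <- rep(seq(1, 5000, by = 100), each = 4)
d <- data.frame(chr   = "chr1",
                start = starts,
                end   = starts + 99,
                tot   = 10,
                methy = runif(length(starts)))

# Same grouping key as in the original script.
key <- paste(d$chr, d$start, d$end)

# Original approach: split the whole data frame, then summarise.
slow <- t(sapply(split(d, key), function(x) summary(1 - x$methy)))

# Alternative: split only the numeric vector that is actually needed.
fast <- t(sapply(split(1 - d$methy, key), summary))

# Both approaches should agree on this example.
stopifnot(isTRUE(all.equal(slow, fast)))

df.summ <- as.data.frame(fast)
```

Timing both versions with system.time() on a real chromosome file would show whether the difference matters in practice; the sketch does not claim a specific speedup.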
