[R] Very slow read.table on Linux, compared to Win2000 [Broadcast]

Liaw, Andy andy_liaw at merck.com
Wed Jun 28 14:43:02 CEST 2006


From: Peter Dalgaard
> 
> <davidek at zla-ryba.cz> writes:
> 
> > Dear all,
> > 
> > I read.table a 17MB tab-separated table with 483 variables
> > (mostly numeric) and 15000 observations into R. This takes a
> > few seconds with R 2.3.1 on Windows 2000, but it takes several minutes
> > on my Linux machine. The Linux machine is Ubuntu 6.06, 256 MB RAM,
> > Athlon 1600 processor. The Windows hardware is better (Pentium 4,
> > 512 MB RAM), but it shouldn't make such a difference.
> > 
> > The strange thing is that even doing something with the data (say a
> > histogram of a variable, or transforming integers into a factor)
> > takes a really long time on the Linux box, and the computer seems to
> > work extensively with the hard disk.
> > Could this be caused by swapping? Can I increase the memory allocated
> > to R somehow?
> > I have checked the manual, but the memory options allowed for linux 
> > don't seem to help me (I may be doing it wrong, though ...)
> > 
> > The code I run:
> > 
> > TBO <- read.table(file="TBO.dat", sep="\t", header=TRUE, dec=",")   # this takes forever
> > TBO$sexe <- factor(TBO$sexe, labels=c("man","vrouw"))   # even this takes like 30 seconds,
> >                                                         # compared to nothing on Win2000
> > 
> > I'd be grateful for any suggestions,
> 
> Almost surely, the fix is to insert more RAM chips. 256 MB 
> leaves you very little space for actual work these days, and 

Try running Windows on the 256MB box and you'll see why Peter recommended
the above.  Consider yourself lucky that R actually still does something
useful under Ubuntu with so little RAM.  If adding more RAM is not an
option, perhaps not running X at all would help.
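
If more RAM really is not an option, the memory read.table itself needs
can probably be trimmed a little. The following is only a sketch, not a
tested fix for this data set: the helper name read_tbo is made up for
illustration, the file name and separators are taken from the original
post, and it assumes the first 20 rows are enough to guess every column's
type. The colClasses, nrows and comment.char hints follow the memory notes
in ?read.table.

```r
# Hypothetical helper (not from the original post): read a tab-separated
# file with decimal commas, giving read.table hints that reduce overhead.
read_tbo <- function(path, n_rows = 15000) {
  # Peek at a few rows only, to learn the column classes.
  # Assumption: the first 20 rows are representative. A column that looks
  # integer here but holds decimals further down will make the full read fail.
  head_rows <- read.table(path, sep = "\t", header = TRUE, dec = ",",
                          nrows = 20)
  classes <- vapply(head_rows, function(x) class(x)[1], character(1),
                    USE.NAMES = FALSE)

  read.table(path, sep = "\t", header = TRUE, dec = ",",
             colClasses   = classes,  # skip type guessing on the full file
             nrows        = n_rows,   # a mild over-estimate is fine
             comment.char = "")       # no comment scanning needed
}
```

It would be called as TBO <- read_tbo("TBO.dat"). None of this removes the
need for more RAM; it only reduces how much extra memory is used while
reading.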

Andy

> a 17MB file will get expanded to several times the original 
> size during reading and data manipulations. Using a 
> lightweight window manager can help, but you usually regret 
> the switch for other reasons. 
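
As a rough check on the "several times" figure above, here is some
back-of-envelope arithmetic. It assumes every one of the 483 columns ends
up stored as an 8-byte double, which the original post does not say (it
only says "mostly numeric"), and it ignores temporary copies made during
reading.

```r
# Dimensions are taken from the original post.
n_obs  <- 15000
n_vars <- 483

# Assumption: every column is stored as a double (8 bytes per value).
bytes_per_double <- 8

# Size of the finished data frame in MB (1 MB = 2^20 bytes).
data_mb <- n_obs * n_vars * bytes_per_double / 2^20
data_mb
```

Under those assumptions this comes to roughly 55 MB for the finished data
frame alone, about three times the 17MB file, on a machine with 256 MB in
total.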
> 
> 
> -- 
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> 
>


