[R] Very slow read.table on Linux, compared to Win2000

Alberto Murta amurta at ipimar.pt
Thu Jun 29 01:43:27 CEST 2006


I have a Pentium 4 PC with 256 MB of RAM, so I made a tab-separated
text file with column names and 15000 x 483 integers:

> system("ls -l teste.txt")
-rw-r--r--  1 amurta  amurta  16998702 Jun 28 16:08 teste.txt

The time it took to import it was around 15 seconds:

> system.time(teste <- read.delim("teste.txt"))
[1] 15.349  0.244 16.597  0.000  0.000

So I think lack of RAM is probably not the main problem.
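One thing that might narrow the gap further (a suggestion of mine, not something
tested in this thread): since the file is known to be all integers, the type-guessing
that read.table/read.delim does per column can be skipped by declaring colClasses
and the row count up front. A sketch, assuming the 15000 x 483 integer file above:

```r
## Hypothetical sketch: read the same tab-separated file with column
## types and row count declared, so R need not guess them while reading.
teste <- read.delim("teste.txt",
                    colClasses = rep("integer", 483),  # every column is integer
                    nrows = 15000)                     # known number of rows
```

With colClasses fixed, R allocates the right storage once instead of re-checking
each field; on files this size the difference is often substantial.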
Cheers

Alberto


> version
               _
platform       i386-unknown-freebsd6.1
arch           i386
os             freebsd6.1
system         i386, freebsd6.1
status
major          2
minor          3.1
year           2006
month          06
day            01
svn rev        38247
language       R
version.string Version 2.3.1 (2006-06-01)
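Peter's remark, quoted below, that a 17MB file "will get expanded to several times
the original size" can be made concrete with back-of-envelope arithmetic (my own
illustration, not from the thread): 15000 x 483 cells occupy roughly 28 MB as 4-byte
integers, and roughly 55 MB if stored as 8-byte doubles, before counting the
temporary copies made during reading.

```r
## Back-of-envelope memory arithmetic for a 15000 x 483 table.
n_rows <- 15000
n_cols <- 483
int_mb    <- n_rows * n_cols * 4 / 2^20  # 4-byte integers: ~27.6 MB
double_mb <- n_rows * n_cols * 8 / 2^20  # 8-byte doubles:  ~55.3 MB
cat(sprintf("integer storage: %.1f MB, double storage: %.1f MB\n",
            int_mb, double_mb))
```

On a 256 MB machine also running a desktop, that is easily enough to push the
system into swap, which would explain the disk activity reported below.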



On Wednesday 28 June 2006 05:43, Liaw, Andy wrote:
> From: Peter Dalgaard
>
> > <davidek at zla-ryba.cz> writes:
> > > Dear all,
> > >
> > > I read.table a 17MB tab-separated table with 483 variables (mostly
> > > numeric) and 15000 observations into R. This takes a few seconds
> > > with R 2.3.1 on Windows 2000, but it takes several minutes on my
> > > Linux machine. The Linux machine is Ubuntu 6.06, 256 MB RAM,
> > > Athlon 1600 processor. The Windows hardware is better (Pentium 4,
> > > 512 MB RAM), but it shouldn't make such a difference.
> > >
> > > The strange thing is that even doing something with the data (say
> > > a histogram of a variable, or transforming integers into a factor)
> > > takes a really long time on the Linux box, and the computer seems
> > > to work extensively with the hard disk. Could this be caused by
> > > swapping? Can I increase the memory allocated to R somehow? I have
> > > checked the manual, but the memory options allowed for Linux don't
> > > seem to help me (I may be doing it wrong, though ...)
> > >
> > > The code I run:
> > >
> > > TBO <- read.table(file="TBO.dat", sep="\t", header=TRUE, dec=",")  # this takes forever
> > > TBO$sexe <- factor(TBO$sexe, labels=c("man","vrouw"))  # even this takes like 30 seconds, compared to nothing on Win2000
> > >
> > > I'd be grateful for any suggestions,
> >
> > Almost surely, the fix is to insert more RAM chips. 256 MB leaves
> > you very little space for actual work these days, and a 17MB file
> > will get expanded to several times the original size during reading
> > and data manipulations. Using a lightweight window manager can
> > help, but you usually regret the switch for other reasons.
>
> Try running Windows on the 256 MB box and you'll see why Peter
> recommended the above.  Consider yourself lucky that R actually still
> does something useful under Ubuntu with so little RAM.  If adding
> more RAM is not an option, perhaps not running X at all would help.
>
> Andy
>
> > --
> >    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
> >   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
> >  (*) \(*) -- University of Copenhagen   Denmark    Ph: (+45) 35327918
> > ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)         FAX: (+45) 35327907
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html

-- 
  Alberto G. Murta
IPIMAR - Institute of Fisheries and Marine Research 
Avenida de Brasilia; 1449-006 Lisboa; Portugal
Tel: +351 213027120; Fax: +351 213015948


