[R] for loop performance

Philipp Pagel p.pagel at wzw.tum.de
Thu Apr 14 10:31:37 CEST 2011


> I am running some simulations in R involving reading in several
> hundred datasets, performing some statistics and outputting those
> statistics to file. I have noticed that it seems that the time it
> takes to process of a dataset (or, say, a set of 100 datasets) seems
> to take longer as the simulation progresses.

Reading data, e.g. with read.table, can be slow because it does a fair
bit of content checking, data type guessing, etc. So I guess the
question is: how is your data stored (files, in what format, or in a
database) and how do you read it into R?

Once we know this there may be tricks to speed up the data import.
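
As a rough sketch of the kind of trick I mean, assuming tab-separated
text files whose column types are known in advance: the temporary file
and its columns (id, x, y) are made up for illustration and are not the
poster's data. Telling read.table what to expect via colClasses, nrows
and comment.char spares it most of the guessing.

```r
# Hypothetical example file so the sketch is self-contained
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:1000, x = rnorm(1000), y = rnorm(1000)),
            file = tmp, sep = "\t", row.names = FALSE, quote = FALSE)

# Default call: read.table inspects the content to guess column types
d1 <- read.table(tmp, header = TRUE, sep = "\t")

# Stating the column types, the number of rows and "no comment
# character" up front removes most of that guessing
d2 <- read.table(tmp, header = TRUE, sep = "\t",
                 colClasses = c("integer", "numeric", "numeric"),
                 nrows = 1000,
                 comment.char = "")
```

Whether this helps noticeably depends on the size and layout of the
real files, so it is worth timing both variants on one of them.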

> I am curious to know if this has to do with how R processes
> code in loops or if it might be due to memory usage issues (e.g.,
> repeatedly reading data into the same matrix).

Probably not - I would guess it's the parsing of the input data that
is slow.
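
One way to check that guess rather than take my word for it is to time
the reading and the statistics separately for each dataset. A minimal
sketch, assuming one file per dataset: the three temporary files and
the colMeans call are stand-ins for the real data and the real
statistics.

```r
# Hypothetical stand-ins: three small temporary files
files <- replicate(3, tempfile(fileext = ".txt"))
for (f in files) {
  write.table(data.frame(x = rnorm(500), y = rnorm(500)),
              file = f, sep = "\t", row.names = FALSE)
}

# Preallocate the timing table instead of growing it inside the loop
timing <- data.frame(read  = numeric(length(files)),
                     stats = numeric(length(files)))

for (i in seq_along(files)) {
  # Elapsed seconds spent reading dataset i
  timing$read[i] <- system.time(
    d <- read.table(files[i], header = TRUE, sep = "\t")
  )["elapsed"]
  # Elapsed seconds spent on the (placeholder) statistics
  timing$stats[i] <- system.time(
    res <- colMeans(d)
  )["elapsed"]
}

print(timing)
```

If the read column dominates, the import is the place to optimize; if
either column grows steadily with i, something in the loop is
accumulating and deserves a closer look.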

cu
	Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/


