[R] different results from summarization by loop and sum or rowMeans function

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Sep 11 16:24:46 CEST 2008


On Thu, 11 Sep 2008, Markus Schmidberger wrote:

> Hi,
>
> I get different results when computing row means with rowMeans() 
> and with a simple for-loop. The differences are very small. But after this

Indeed, but the C code (rowMeans) is likely to be more accurate as it uses 
an extended-precision accumulator.
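
A minimal sketch of the effect follows. It uses synthetic data rather than the affy example (an assumption, so that the snippet runs without Bioconductor), and the variable names are illustrative only. On platforms where R's extended-precision type is no wider than a double, the two results may agree exactly.

```r
# Synthetic stand-in for the expression matrix (assumption: not the affy data)
set.seed(1)
mat <- asinh(exp(1) * matrix(runif(300, 50, 5000), nrow = 100, ncol = 3))

# Row means from the built-in C implementation
row_means_c <- rowMeans(mat)

# Row means from plain double-precision accumulation in an R loop
row_means_loop <- numeric(nrow(mat))
for (i in seq_len(nrow(mat))) {
  for (j in seq_len(ncol(mat))) {
    row_means_loop[i] <- row_means_loop[i] + mat[i, j]
  }
}
row_means_loop <- row_means_loop / ncol(mat)

# Largest discrepancy, shown next to machine epsilon for scale
max_abs_diff <- max(abs(row_means_c - row_means_loop))
print(max_abs_diff)
print(.Machine$double.eps)

# Exact comparison can fail where a tolerance-based comparison succeeds
print(identical(row_means_c, row_means_loop))
print(isTRUE(all.equal(row_means_c, row_means_loop)))
```

Comparing with all.equal() rather than identical() or == is the usual way to treat such results as equal up to rounding.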

> calculation I run some optimization algorithms (BFGS or CG), and there 
> I get huge differences (caused by the small changes in the starting 
> values; I changed nothing else in the code).
> How can I avoid these differences between sum-loops and sum-functions?

You cannot. What you can do is work on making what you do with these 
inputs numerically stable: unless you do so your end results will have 
very little value.  (For example, are you finding different local minima, 
in which case you need to decide how to treat that possibility?)
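
One way to probe the local-minima question is to run the optimizer from several start values, including one perturbed at rounding-error scale, and compare where each run ends up. The sketch below does this with a hypothetical two-minimum objective; `objective`, `gradient` and `starts` are illustrative assumptions, not the poster's actual problem.

```r
# Hypothetical objective with two equally good minima, at (-1, 0) and (1, 0)
objective <- function(p) (p[1]^2 - 1)^2 + p[2]^2
gradient  <- function(p) c(4 * p[1] * (p[1]^2 - 1), 2 * p[2])

# Start values: a baseline, a copy perturbed by 1e-12, and a mirrored start
starts <- rbind(c(0.5, 0.5),
                c(0.5 + 1e-12, 0.5),
                c(-0.5, 0.5))

# Run BFGS from each start and collect minimizer, value and convergence code
fits <- t(apply(starts, 1, function(s) {
  fit <- optim(s, objective, gradient, method = "BFGS")
  c(x1 = fit$par[1], x2 = fit$par[2],
    value = fit$value, convergence = fit$convergence)
}))

# One row per start: first two columns are the start, the rest the result
print(cbind(starts, fits))
```

If runs that differ only by the tiny perturbation end at noticeably different minimizers, the downstream computation is sensitive to rounding in its inputs; if only well-separated starts disagree, the objective simply has several local minima and the analysis has to account for that.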

I suggest reading an introductory book on Numerical Analysis, or

Monahan, J. F. (2001) Numerical Methods of Statistics. Cambridge: 
Cambridge University Press. Chapter 2.

or

Press, W. H., Teukolsky, S. A., Vetterling, W. T. and Flannery, B. P. 
(2007) Numerical Recipes: The Art of Scientific Computing. Third 
Edition. Cambridge. Section 1.1 (I think).

> Attached is a small test script using data from Bioconductor.
>
> Best
> Markus
>
>
> library(affy)
> data(affybatch.example)
> mat <- exprs(affybatch.example)[1:100,1:3]
> mat <- exp(1)*mat
> mat <- asinh(mat)
>
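> rowM1 <- rowMeans(mat)  # row means via the built-in function
>
> t <- rep(0, 100)  # vector of zeros, one accumulator per row
> for (i in 1:100) {
>   for (j in 1:3)
>     t[i] <- t[i] + mat[i, j]
> }
> rowM2 <- t / 3  # row means via explicit accumulation
>
> m1 <- mat - rowM1
> m2 <- mat - rowM2
>
> print(m1 - m2)  # differences between the two centered matrices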
>
> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] affy_1.18.2          preprocessCore_1.2.0 affyio_1.8.0
> [4] Biobase_2.0.1
> -- 
> Dipl.-Tech. Math. Markus Schmidberger
>
> Ludwig-Maximilians-Universität München
> IBE - Institut für medizinische Informationsverarbeitung,
> Biometrie und Epidemiologie
> Marchioninistr. 15, D-81377 Muenchen
> URL: http://ibe.web.med.uni-muenchen.de Mail: Markus.Schmidberger [at] 
> ibe.med.uni-muenchen.de
> Tel: +49 (089) 7095 - 4599
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

