[R] how to improve the precison of this calculation?

Mike Marchywka marchywka at hotmail.com
Mon Feb 14 13:39:03 CET 2011




[ anyone know how to get hotmail to consistently mark original text? 
Now its hanging my keyboard in firefox  LOL ]

Anyway, I think I was the one advocating these approaches over things like
indefinite length calculations and I punted the R questions to others but 
I'm not real sure what you are asking here. It sounds like you are trying
to add a small number to a larger number and concerned that nothing is changed.

Normal floating point has some minimum number, I can't remember what it is
usually called, I'll use "d", such that 1+d==1. So if you keep adding
things that are too small they will all get dropped. There are probably
some numerical people here but just offhand you could try ordering your
terms, say you have a series of binomial coefficient b(i) and you
can start summing from the smallest one first, that should preserve important
sums. That is, if you want to add "d" to "1" say "n" times, do this d*n+1
rather than repeatedly adding 1+d+d+d+... executed from left to right. 
It may be easy to order your terms without evaluating them and express
each b(i+1)=a(i+1)*b(i) or do other tricks and find ways to avoid the issues
as much as possible. 


ok, what I called "d" above they call either "X" or epsilon ,



I was hoping to find something on intel site since they have lots
of numerical optimization stuff but if it is there it is buried in everything
else. 

If you actually look at hard core floating point code, you often
find comments about " if your compiler believes that addition and multiplication
commute or ... " whatever, it becomes obvious that pepole run into these problems
a lot. It is a nice way to turn an IIR audio filter into a tone generator for
a fire alarm LOL, you don't need huge numbers to see these issues come up. 
AFAIK, most issues with implementing linear algebra are related to these etc. 



----------------------------------------
Date: Sat, 12 Feb 2011 15:31:22 +0800
From: zhaoxing731 at yahoo.com.cn
To: R-help at stat.math.ethz.ch
Subject: [R] how to improve the precison of this calculation?


Hello
        T
        I want to order some calculation "result", there will be lots of "result" that need to calculate and order
        PS: the "result" is just a intermediate varible and ordering them is the very aim

# problem:
# For fixed NT and CT, and some pair (c,n). order the pair by corresponding result
# c and n are both random variable

CT<-6000        #assignment to CT
NT<-29535210    #assignment to NT

# formulae for calculation intermediate variable "result", this is just a picture of the calculation which can't implement due to  the precision problem

i<-0:(c-1)
vec<-choose(n,i)*choose(NT-n,CT-i)      #get the vector which need summation
result<-log(sum(vec))                   #get the log of summation

# thanks to Petr, we have a solution
calsta <- function(c, n) {
i <- seq(from=0, length=c)
logx <- lchoose(NT-n, CT-i) + lchoose(n, i)
logmax <- max(logx)
logmax + log(sum(exp(logx - logmax)))
}

# now, new problem arise, in theory, the "result" of different (c,n) pair should most probably differ, so I can order them, but
> calsta(918,100000)-calsta(718,100000)
[1] 0
> calsta(718,90000)-calsta(718,90000)
[1] 0
> calsta(718,90000)-calsta(918,100000)
[1] 0
# As Eik pointed out in another post, "calsta constantly returns 57003.6 for c>38 (the summands in
# sum(exp(logx - logmax)) will become 0 for c>38)" (when n= 10,000)


I think there are two ways to solve this problem:

1.Reducing the calcuation formulae algebraically get a conciser kernel for comparing. I think this is the perfect method and I failed many times, though this is not the topic of mailing-list, I hope if someone with stronge algebraical skills could give me some surprise

2.Through skills of R, which is difficult for me

PS: I don't want a perfect solution but a better one, which could have a higher discriminability than before

Thank you in advance

Yours sincerely

ZhaoXing
Department of Health Statistics
West China School of Public Health
Sichuan University
No.17 Section 3, South Renmin Road
Chengdu, Sichuan 610041
P.R.China

__________________________________________________
¸Ï¿ì×¢²áÑÅ»¢³¬´óÈÝÁ¿Ãâ·ÑÓÊÏä?


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
 		 	   		  


More information about the R-help mailing list