# [R] Retrieve hypergeometric results in large scale

William Dunlap wdunlap at tibco.com
Mon Oct 1 23:24:46 CEST 2012

```
order() is usually a lot more useful than sort(), since, as you noticed,
sort() drops information about where each element in its output came
from.

Here is an example using the binomial distribution, which I
think is similar.
> n <- 10 ; p <- 0.7 ; k <- 0:n ; d <- dbinom(k, n, p)
> plot(k, d) # density of binomial over its domain
If you want the indices of the largest density values whose
cumulative sum is less than 0.95 you can do
> ord <- order(d, decreasing=TRUE) # indices such that d[ord] is in decreasing order
> big <- ord[cumsum(d[ord]) < 0.95]
> data.frame(big, d=d[big], cumsum=cumsum(d[big]))
  big         d    cumsum
1   8 0.2668279 0.2668279
2   9 0.2334744 0.5003024
3   7 0.2001209 0.7004233
4  10 0.1210608 0.8214841
5   6 0.1029193 0.9244035
> points(cex=2, k[big], d[big])
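
# A minimal sketch (not part of the original reply) that wraps the steps
# above in a function. The name top_indices and its 'level' argument are
# hypothetical, chosen only for illustration.
top_indices <- function(d, level = 0.95) {
  ord <- order(d, decreasing = TRUE)  # indices of d, largest density first
  ord[cumsum(d[ord]) < level]         # keep indices while the running total is below level
}
d <- dbinom(0:10, 10, 0.7)
top_indices(d)  # same indices as 'big' above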

If you want to include the index of the density value that puts
you just over 0.95, first find the complement of the desired indices
(the smallest density values whose cumulative sum is less than 0.05)
and use setdiff() to compute its complement.  E.g.,
> ord <- order(d)
> little <- ord[cumsum(d[ord]) < 0.05]
> big <- setdiff(seq_along(d), little) # difference of two sets of numbers
> big
[1]  5  6  7  8  9 10
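
# A rough sketch (not part of the original reply) of the same idea applied
# to dhyper(), as in the original question. The values of m, n and k are
# made up for illustration. Instead of setdiff(), this variant keeps
# sorted indices up to the first one whose running total reaches 0.95;
# apart from ties and rounding it should select the same set.
m <- 30; n <- 70; k <- 20
x <- 0:k                                    # possible counts in the sample
d <- dhyper(x, m, n, k)
ord <- order(d, decreasing = TRUE)          # indices of d, largest density first
n_keep <- which(cumsum(d[ord]) >= 0.95)[1]  # first position reaching 0.95
big <- ord[seq_len(n_keep)]
data.frame(x = x[big], d = d[big], cumsum = cumsum(d[big]))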

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of jas4710
> Sent: Monday, October 01, 2012 9:59 AM
> To: r-help at r-project.org
> Subject: Re: [R] Retrieve hypergeometric results in large scale
>
> Thanks Jeff~~~
>
> In fact I do not know how to combine and extract vectors in R.
>
> ans<-sort(dhyper(x, m, n, k),decreasing=TRUE)
> rbind(ans,cumsum(ans))
>
> will show the first point that exceeds 95% threshold. The problem is:
> *information is lost*
> I can no longer identify where are the first few elements from. e.g. for 10
> numbers, maybe they are from 4,5,6,7 or for 100 numbers, from 45 to 68
>
> So to append ID's to the data for later retrieval? rbind appears to do the
> job but not so exactly...