[R] Fitting a distribution to peaks in histogram

Petr Pikal petr.pikal at precheza.cz
Wed Jul 19 14:54:51 CEST 2006


Hi

There are some packages for mass spectra processing (spectrino, 
caMassClass). I have not used them, so I do not know how well they 
suit your needs.

However, you can compute the area (integrate) with these functions:

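# integ() works interactively on a plot of (x, y):
# first it redraws the data between the corners with replot(x, y),
# then you click two limits with locator() and it computes
# osum  - the area between the x axis and the y values, and
# cista - the area between the "baseline" and the y values
#         (the baseline is the straight line joining the curve at the
#         two limits)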

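integ <- function(x, y)
{
replot(x, y)                    # helper, not defined in this message
meze <- sort(locator(2)$x)      # the two clicked limits, ascending
dm <- meze[1]                   # lower limit
hm <- meze[2]                   # upper limit
abline(v = c(dm, hm), col = 2)
vyber <- x <= hm & x >= dm      # points inside the limits
f3 <- splinefun(x, y)
osum <- integrate(f3, dm, hm)$value    # area down to the x axis
# o1: trapezoid under the line joining the first and last selected points
xr <- range(x[vyber])
o1 <- (y[x == xr[1]] + y[x == xr[2]]) * (xr[2] - xr[1]) / 2
cista <- osum - o1              # area above the baseline
return(c(osum, cista))
}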

# Similar to integ, but you have to supply the lower and upper limits
# (dm, hm) manually if you do not want to "integrate" the whole
# area under the curve.


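integ1 <- function(x, y, dm = -Inf, hm = +Inf)
{
# default limits: the whole range of x
if (dm == -Inf) dm <- min(x)
if (hm == +Inf) hm <- max(x)
vyber <- x <= hm & x >= dm      # points inside the limits
f3 <- splinefun(x, y)
osum <- integrate(f3, dm, hm)$value    # area down to the x axis
# o1: trapezoid under the line joining the first and last selected points
xr <- range(x[vyber])
o1 <- (y[x == xr[1]] + y[x == xr[2]]) * (xr[2] - xr[1]) / 2
cista <- osum - o1              # area above the baseline
return(c(osum, cista))
}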
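
As a rough check of the idea behind integ1, here is a small self-contained sketch on made-up data. The name peak_area and the synthetic peak are only assumptions for illustration, they are not part of the functions above. Unlike integ1, it takes the baseline heights from the spline at the limits themselves, so the limits do not have to coincide with values in x.

```r
# Hypothetical helper (illustration only): integrate a spline through
# (x, y) between lo and hi, then subtract the trapezoid under the
# straight line that joins the curve at the two limits.
peak_area <- function(x, y, lo = min(x), hi = max(x)) {
  f <- splinefun(x, y)
  total <- integrate(f, lo, hi)$value          # area down to the x axis
  baseline <- (f(lo) + f(hi)) * (hi - lo) / 2  # trapezoid under the baseline
  c(total = total, net = total - baseline)
}

# Made-up data: a Gaussian peak of area 10 sitting on a sloping baseline
x <- seq(0, 10, by = 0.05)
y <- 0.5 + 0.1 * x + 10 * dnorm(x, mean = 5, sd = 0.4)

res <- peak_area(x, y, lo = 3, hi = 7)
print(res)
```

With limits well outside the peak, the "net" value should come out close to the area of the peak alone (10 in this made-up case), while "total" also includes the area under the baseline.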



On 19 Jul 2006 at 11:58, Ulrik Stervbo wrote:

Date sent:      	Wed, 19 Jul 2006 11:58:38 +0200
From:           	"Ulrik Stervbo" <ulriks at ruc.dk>
To:             	r-help at stat.math.ethz.ch
Subject:        	[R] Fitting a distribution to peaks in histogram

> Hello list!
> 
> I would like to fit a distribution to each of the peaks in a
> histogram, such as this:
> http://photos1.blogger.com/blogger/7029/2724/1600/DU145-Bax3-Bcl-xL.png
> 
> The peaks are identified using Petr Pikal peaks function (
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html), but after
> that I am quite stuck.
> 
> Any idea as to how I can:
> 1. Fit a distribution to each peak
> 2. Integrate the area between each two peaks, using the means and
>    widths of the distributions fitted to the two peaks. I will be
>    using the integrate function
> 
> The histogram is based on approximately 15000 events, which makes
> Mclust and pam (which both deliver the information I need) less
> useful.
> 
> The whole point of this exercise is to find the percentage of cells in
> peak 1, 2, 3, and so on, and between peak 1-2, peak 2-3, peak 3-4 and
> so on. Having more than 6 peaks does not appear likely.
> 
> I am quite new to R and apologise if the solution is fairly basic.
> 
> Thank you in advance for any help and suggestions
> 
> Sincerely,
> Ulrik

Petr Pikal
petr.pikal at precheza.cz


