[R] boostrap astronomy problem

Mon Jan 2 18:29:01 CET 2006

Hi,

I am an astronomer and somewhat new to boostrap statistics.  I understand
the basic idea of bootstrap resampling, but am uncertain if it would be
useful in my case or not.  My problem consists of maximizing a likelihood
function based on the velocities of a number of stars.  My assumed
distribution of velocities of these stars is:
---------------------------------------------------------------------------
dmy=function(x,v,k,t)(k+1)/(v-t)^(k+1)*(v-x)^k
---------------------------------------------------------------------------
where x would be my stellar velocities.  (Essentially it is a beta
distribution.)
---------------------------------------------------------------------------
My likelihood function looks something like this:

lm<-function(x){
	e<-x[1]
	k<-x[2]
	log(e) - n*log(k+1) + (k+1)*n*log(e-t)-k*sum(log(e-vg))
}
--------------------------------------------------------------------------
The quantities n and t are known, and vg is my velocity data.  I am
minimizing this function using the function "optim" (BFGS option) to find
k and e that minimize this.  Also, my data set is small, only about 50
stars.  Therefore, I was thinking that I could use the boot function to
resample my data and solve the minimization for each resample.  This way I
believe I'll get better estimates for standard errors and confidence
intervals.  Is it safe to assume that the distributions for k and e are
approximately Normal, therefore making the bootstrap useful?  I have
actually used the boot function with this set up:
--------------------------------------------------------------------------
mystat=function(s,b){
	#Negative Log Likelihood Function
	lm<-function(x){
   		e<-x[1]
   		k<-x[2]
   		log(e) - n*log(k+1) + (k+1)*n*log(e-t) -
			k*sum(log(e-s[b]))
	}

	#Gradient of Negative Likelihood Function
	glm=function(x){
   		e<-x[1]
   		k<-x[2]
   		c(1/e + (k+1)*(n/(e-t)) - k*sum(1/(e-s[b])),-n/(k+1) +
			n*log(e-t) - sum(log(e-s[b])))
	}

optim(c(480.,2.),lm,glm,method="BFGS",control=list(maxit=10000000))$par
}

#Compute Bootstrap replicates of escape velocity and kr
m2B2=boot(vg,mystat,5000)
------------------------------------------------------------------------
Does this appear to be correct for what I'd like to achieve?  I have
looked at the distribution and it appears to be about Normal, but can I
say that this is true for the sampling distribution as well?  Also, the
bootstrap distribution is fairly biased, should I be using "bca" or tilted
bootstrap confidence intervals?  If so, I am having some trouble getting
the tilted bootstrap to work.  Specifically, it is having trouble finding
"multipliers".

Also, should I be in some way taking into account my velocity distribution
when resampling?  Any suggestions would be very helpful, thanks.

Thank you for your time.

Greg Ruchti

--
Gregory Ruchti
Bloomberg Center for Physics and Astronomy
Johns Hopkins University
3400 N. Charles St.
Baltimore, MD 21218-1216

gruchti at pha.jhu.edu
Tel: (410)516-8520