[R] hurdle, simulated power

David Atkins datkins at u.washington.edu
Wed May 4 17:13:30 CEST 2011


Hi all--

We are planning an intervention study for adolescent alcohol use, and I 
am planning to use simulations based on a hurdle model (using the 
hurdle() function in package pscl) for sample size estimation.

The simulation code and power code are below -- note that at the moment 
the "power" code is just returning the coefficients, as something isn't 
working quite right.

The average estimates from code below are:

count_(Intercept)         count_trt  zero_(Intercept)
       2.498327128      -0.000321315       0.910293501
          zero_trt
      -0.200134813

Three of the four look right (ie, converging to population values), but 
the count_trt is stuck at zero, regardless of sample size (when it 
should be ~ -0.20).

Does anyone see what's wrong?

Thanks for any input.

cheers, Dave



mysim <- function(n, beta0, beta1, alpha0, alpha1, theta){
	trt <- c(rep(0,n), rep(1,n))
	### mean function logit model
	p0 <- exp(alpha0 + alpha1*trt)/(1 + exp(alpha0 + alpha1*trt))
	### 0 / 1 based on p0
	y1 <- as.numeric(runif(n)>p0)
	### mean function count portion
	mu <- exp(beta0 + beta1*trt)
	### estimate counts using NB dist
	require(MASS, quietly = TRUE)
	y2 <- rnegbin(n, mu = mu, theta = theta)
	### if y2 = 0, draw new value
	while(sum(y2==0)>0){
		y2[which(y2==0)] <- rnegbin(length(which(y2==0)), mu=mu[which(y2==0)], 
theta = theta)
	}
	y<-y1*y2
	data.frame(trt=trt,y=y)
}
#alpha0, alpha1 is the parameter for zero part
#beta0,beta1 is the parameter for negative binomial
#theta is dispersion parameter for negative binomial, infinity 
correspond to poisson
#

#example power analysis
#return three power, power1 for zero part, power2 for negative binomial part
#power3 for joint test,significance level can be set, default is 0.05
#M is simulation time
#require pscl package
#library(pscl)

mypower <- function(n, beta0, beta1, alpha0, alpha1, theta, 
siglevel=0.05, M=1000){
	myfun <- function(n,beta0,beta1,alpha0,alpha1,theta,siglevel){
		data <- mysim(n,beta0,beta1,alpha0,alpha1,theta)
		require(pscl, quietly = TRUE)
		res <- hurdle(y ~ trt, data = data, dist = "negbin", trace = FALSE)
		est <- coef(res)#[c(2,4)]
		#v<-res$vcov[c(2,4),c(2,4)]
		#power1<-as.numeric(2*pnorm(-abs(est)[2]/sqrt(v[2,2]))<siglevel)
		#power2<-as.numeric(2*pnorm(-abs(est)[1]/sqrt(v[1,1]))<siglevel)
		#power3<-as.numeric((1-pchisq(t(est)%*%solve(v)%*%est,df=2))<siglevel)
		#c(power1,power2,power3)
	}
	r <- replicate(M, myfun(n,beta0,beta1,alpha0,alpha1,theta,siglevel), 
simplify=TRUE)
	apply(r, 1, mean)
}

out <- mypower(n = 1000, beta0 = 2.5, beta1 = -0.20,
						 alpha0 = -0.90, alpha1 = 0.20,
						 theta = 2.2, M = 100)
out


-- 
Dave Atkins, PhD
Research Associate Professor
Department of Psychiatry and Behavioral Science
University of Washington
datkins at u.washington.edu

Center for the Study of Health and Risk Behaviors (CSHRB)		
1100 NE 45th Street, Suite 300 	
Seattle, WA  98105 	
206-616-3879 	
http://depts.washington.edu/cshrb/
(Mon-Wed)	

Center for Healthcare Improvement, for Addictions, Mental Illness,
   Medically Vulnerable Populations (CHAMMP)
325 9th Avenue, 2HH-15
Box 359911
Seattle, WA 98104
http://www.chammp.org
(Thurs)



More information about the R-help mailing list