# [R] Gamma Simulation from Exponential Distribution: find sample size given power and sig.Level

MacQueen, Don m@cqueen1 @end|ng |rom ||n|@gov
Mon Oct 29 21:00:04 CET 2018

```If this is homework, then r-help has a no homework policy. I'm assuming that if it is homework then the focus is on statistical concepts, not on R programming.

It looks like your gen_p_vals function should be defined as

gen_p_vals <- function(reps = n)

Why not just use the rgamma() function to simulate your gamma distributions?
Why not just use the t.test() function do calculate the t test?
But if you really want to calculate the t test manually, use the var() function in the denominator, instead of that x-mean(x) business.
It's unnecessary to use return().
Initializing Z.list outside the gamma.sim function does no good.
Are you using the variable "n" for different purposes in different places? If so, it's confusing.

Although it didn't appear to cause problems this time, sending html formatted email to r-help is generally not a good idea.

(and real names are preferred)

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509

﻿On 10/28/18, 1:26 PM, "R-help on behalf of Black Yellow" <r-help-bounces using r-project.org on behalf of blackish952 using gmail.com> wrote:

I built my function to simulate two gamma distributions X and Y based on the
simulation, I am interested in finding an equal sample size n for X and Y
such that n can be determined given 90% power and 5% significance level.

My thought process is that I create a sequence of n's value and use a for
loop to reiterate each n value in the sequence along with an "if" statement
saying that if given the n value, 95% of the time when I replicate the
process, the probability that I reject the null hypothesis given the
alternative is true is 90%.

The problem I am facing is that n value is always the maximum within my
n_seq. I am doubting there must be something wrong when I generate my two
Gamma distributions.

Any input is appreciated.

set.seed(3701)
n_seq = seq(from = 1, to = 200, by = 1)
Z.list = numeric(reps)
var.list = numeric(reps)

gamma.sim = function(n, mu){
for(j in 1:5e5){
u.list = runif(n = n)
x.list = -mu*log(1-u.list)
Z.list[j] = sum(x.list)
}
return(Z.list)
}

# Generate n generations for X.gamma and Y.gamma
gen_p_vals <- function(n = reps){
p_vec <- numeric(reps)
for(ii in 1:reps){
x <- gamma.sim(n = 79, mu = 1)
y <- gamma.sim(n = 80, mu = 1)
t_stat_numerator <- mean(x) - mean(y) - 0 # Test Statistic under
null hypothesis
s_p <- sqrt(sum((x-mean(x))^2) + sum((y-mean(y))^2)) /
(sqrt(2*reps-2))
t_stat <- t_stat_numerator /(s_p*(sqrt((1/reps)+(1/reps))))
p_vec[ii] <- 2* pt(-abs(t_stat), 2*reps-2)
}
return(p_vec)
}
power_pvals <- gen_p_vals(n = 10)
mean(power_pvals < 0.05)

for(j in 2:100){
power_pvals = gen_p_vals(n = n_seq[j])
if (mean(power_pvals < 0.05) == 0.90){
print(j)
}
}

[[alternative HTML version deleted]]

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help