[R] How to compute p-Values

David Winsemius dwinsemius at comcast.net
Thu Jan 15 00:06:22 CET 2009


On Jan 14, 2009, at 4:27 PM, Andreas Klein wrote:

> Ok... I set up the problem by some code as requested:
>
> Example:
>
> x <- rnorm(100)
>
> mean_x <- mean(x)
>
> mean_boot <- numeric(1000)
>
> for (i in 1:1000) {
>
>  mean_boot[i] <- mean(sample(x,100,replace=TRUE))
>
> }
>
>
> How can I compute the p-Value out of mean_boot for the following  
> tests:

You should realize that you are conflating p-values and Neyman-Pearson  
hypothesis-testing formalism. It's perfectly possible that your  
textbook is also doing the same sort of conflation.
>
>
> 1. H0: mean_x = 0 vs. H1: mean_x != 0
>
> 2. H0: mean_x >= 0 vs. H1: mean_x < 0
>
>
>
> Is there a possibility to construct such p-Values or did I get  
> something wrong?

>
> Someone told, that the p-Value = 2 * min(sum(mean_boot>=0)/1000,  
> sum(mean_boot<0)/1000) for the first (two sided) test is, but didn't  
> get the idea behind it. Maybe someone can explain it, if it is the  
> solution to the problem.

That does not sound correct. For one thing, no mention is made of  
comparing the mean of a particular sample (of the same size as the  
bootstrap samples) against the distribution of bootstrap means. For  
another thing, you want to know if the sample mean either is less than  
the 0.025 quantile of the boot_mean distribution or is greater than  
the 0.975 quantile. Perhaps your informant meant to construct the test  
as 2* min( sum(mean_boot[i] < mean_x)/1000, sum(mean_boot[i] > mean_x)/ 
1000).

In the HA: mean_x <0 directed one sided case, you are only interested  
in whether the mean is below the 0.05 quantile. For the HA: mean_x >0  
you want to know if mean_x is above the 0.95 quantile

In my realization of the mean_boot I get:

quantile(mean_boot, probs = c(0.025, 0.05, 0.95, 0.975)  )
       2.5%         5%        95%      97.5%
-0.2057778 -0.1643545  0.1562328  0.1825198

Those are going to be your critical points for alpha=0.05 Neyman- 
Pearson tests of the sorts 1 and 2. The outer numbers are for the two- 
sided alternative.

For calculation of the p-values, I still think you need ecdf and  
probably Hmisc:::inverseFuntion as well. For p-values you need to go  
from the observed value back to the proportion that exceed that value.


>
>
> Regards,
> Andreas.
>
>
>
> --- David Winsemius <dwinsemius at comcast.net> schrieb am Mi, 14.1.2009:
>
>> Von: David Winsemius <dwinsemius at comcast.net>
>> Betreff: Re: [R] How to compute p-Values
>> An: klein82517 at yahoo.de
>> CC: "r help" <r-help at r-project.org>
>> Datum: Mittwoch, 14. Januar 2009, 16:40
>> I think we are at the stage where it is your responsibility
>> to provide some code to set up the problem.
>>
>> --David Winsemius
>> On Jan 14, 2009, at 9:23 AM, Andreas Klein wrote:
>>
>>> Hello.
>>>
>>> What I wanted was:
>>>
>>> I have a sample of 100 relizations of a random
>> variable and I want a p-Value for the hypothesis, that the
>> the mean of the sample equals zero (H0) or not (H1). That is
>> for a two sampled test.
>>> The same question holds for a one sided version, where
>> I want to know if the mean is bigger than zero (H0) or
>> smaller or equal than zero (H1).
>>>
>>> Therfore I draw a bootstrap sample with replacement
>> from the original sample and compute the mean of that
>> bootstrap sample. I repeat this 1000 times and obtain 1000
>> means.
>>>
>>> Now: How can I compute the p-Value for an one sided
>> and two sided test like described above?
>>>
>>>
>>>
>>> Regards,
>>> Andreas
>>>
>>>
>>> --- gregor rolshausen
>> <gregor.rolshausen at biologie.uni-freiburg.de> schrieb
>> am Mi, 14.1.2009:
>>>
>>>> Von: gregor rolshausen
>> <gregor.rolshausen at biologie.uni-freiburg.de>
>>>> Betreff: Re: [R] How to compute p-Values
>>>> An: "r help"
>> <r-help at r-project.org>
>>>> Datum: Mittwoch, 14. Januar 2009, 11:31
>>>> Andreas Klein wrote:
>>>>> Hello.
>>>>>
>>>>>
>>>>> How can I compute the Bootstrap p-Value for a
>> one- and
>>>> two sided test, when I have a bootstrap sample of
>> a
>>>> statistic of 1000 for example?
>>>>>
>>>>> My hypothesis are for example:
>>>>>
>>>>> 1. Two-Sided: H0: mean=0 vs. H1: mean!=0
>>>>> 2. One Sided: H0: mean>=0 vs. H1: mean<0
>>>>>
>>>>>
>>>> hi,
>>>> do you want to test your original t.test against
>> t.tests of
>>>> bootstrapped samples from you data?
>>>>
>>>> if so, you can just write a function creating a
>> vector with
>>>> the statistics (t) of the single t.tests (in your
>> case 1000
>>>> t.tests each with a bootstrapped sample of your
>> original
>>>> data -> 1000 simulated t-values).
>>>> you extract them by:
>>>>
>>>>> tvalue=t.test(a~factor)$statistic
>>>>
>>>> then just calculate the proportion of t-values
>> from you
>>>> bootstrapped tests that are bigger than your
>> original
>>>> t-value.
>>>>
>>>>>
>> p=sum(simualted_tvalue>original_tvalue)/1000
>>>>
>>>>
>>>> (or did I get the question wrong?)
>>>>
>>>> cheers,
>>>> gregor
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained,
>>>> reproducible code.
>>>
>>>
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained,
>> reproducible code.
>
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.




More information about the R-help mailing list