[R] p-val issue for ranked two-group test

Joshua Wiley jwiley.psych at gmail.com
Thu Oct 20 23:48:28 CEST 2011


Hi,

It looks like you are trying to manually bootstrap.  Take a look at:

require(boot)
?boot

As an added advantage of using boot instead of doing it manually, you
can easily parallelize.  In fact, if you are using one of the
pre-release versions of R 2.14.0, the new parallel package is included
by default, so you do not even have to go venturing out into the wide
world of CRAN to find it.  That said, several aspects of your code
could be readily vectorized.  I can supply more specific details if you
provide a less homework-like example.
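
For illustration only, here is a rough sketch of how the resampling in
the quoted code might look with boot.  It rests on a few assumptions:
because the quoted loop samples without replacement, I use
sim = "permutation" rather than an ordinary bootstrap; the helper
rank.diff and its argument n1 are names made up for this sketch; and
the observed difference is taken from the ranked data, since
mean.mange and mean.healthy are never defined in the code as posted.

```r
library(boot)

# Data copied from the quoted message
mange <- c(35, 60, 81, 158, 89, 130, 90, 38, 119, 137, 52, 30, 27, 115,
           123, 31, 124, 91)
healthy <- c(46, 50, 30, 58, 32, 42, 42, 33, 19, 42, 30, 26, 38, 23, 16, 28,
             42, 42, 33, 35, 51, 31, 39, 40, 42, 38, 36, 39, 38)

n.mange <- length(mange)
both.ranks <- rank(c(mange, healthy))  # rank the pooled data

# Hypothetical helper: mean rank of the first n1 values minus the mean
# rank of the remaining values, after reordering the ranks by idx
rank.diff <- function(r, idx, n1) {
  perm <- r[idx]
  mean(perm[seq_len(n1)]) - mean(perm[-seq_len(n1)])
}

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
perm.out <- boot(both.ranks, rank.diff, R = 1000,
                 sim = "permutation", n1 = n.mange)

obs.diff <- perm.out$t0  # difference in mean ranks for the original grouping

# Two-sided p-value: share of permuted differences at least as extreme
# (in absolute value) as the observed one
p.value <- mean(abs(perm.out$t) >= abs(obs.diff))
p.value
```

Comparing absolute values is one way to get a two-sided p-value without
doubling a one-sided proportion.  Recent versions of boot also accept
parallel and ncpus arguments if you want to spread the resampling
across cores.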

Cheers,

Josh

On Thu, Oct 20, 2011 at 10:17 AM, Laurel Klein Serieys
<laurelklein at ucla.edu> wrote:
> Hi-
> I'm wondering if anyone can help me with my code.  I'm coming up dry when I
> try to get a p-value from the following code.  If I make a histogram of my
> resampled distribution, I find the difference between my groups to be
> significant.  I've ranked the data since I have outliers in one of my
> groups.
>
> mange= c(35,  60,  81, 158, 89, 130,  90,  38, 119, 137,  52,  30,  27, 115,
> 123,  31, 124,  91)
>
> healthy= c(46, 50, 30, 58, 32, 42, 42, 33, 19, 42, 30, 26, 38, 23, 16, 28,
> 42, 42, 33, 35, 51, 31, 39, 40 , 42, 38, 36, 39, 38)
>
> l.mange<-length(mange)
> l.healthy<-length(healthy)
>
> # the expected difference between the means of the ranked groups
> exptdiff <- mean.mange - mean.healthy
>
>
> # concatenate the two vectors into one in preparation for resampling
> both.chemistry<-c(mange, healthy)
>
>
> # rank the combined data in case there are outlying values in the data
> # or the dataset is small
> both.ranks<-rank(both.chemistry)
>
> reps=1000
>
>
> # z will be the storage vector for the resampled differences
> z<-rep(NA,reps)
>
> for(i in 1:reps){ #create the loop
>
> # resample: draw the entire combined data without replacement
> x<- sample(both.ranks, length(both.ranks),replace=FALSE)
>
> # simulated mean value for the resampled mange values
> p.mange<-mean(x[(1:l.mange)])
> # simulated mean value for the resampled healthy values
> p.healthy<-mean(x[(l.mange+1):(l.mange+l.healthy)])
>
> pdiff<- p.mange-p.healthy #the simulated difference between groups
>
> z[i]<- pdiff  #the stored list of simulated differences
> }
> p=mean(z>=exptdiff)*2 #2-tailed test multiply by two
> p
>
> hist(z, xlab="Resample Values", main="Distribution for Two-Group BUN Test")
> confints=quantile(z, c(0.025,0.975))
> abline(v=confints, col="blue") #draw a line for each cutoff
> abline(v=exptdiff, col="red")
>
> Thanks!
> L.Serieys
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


