[R] loop for comparing two or more groups using bootstrapping
Jim Lemon
drj|m|emon @end|ng |rom gm@||@com
Wed Sep 12 00:43:36 CEST 2018
Hi Kristi,
My fault, I only worked out how to assign the values to the names and pick
out the columns of daT for the calculations. I think this does what you
want, but I can't guarantee the result.
daT<-structure(list(year1=c(0.417,0.538,0.69,0.688,0.688,0.606,
0.667,0.7,0.545,0.462,0.711,0.642,0.744,0.604,0.612,
0.667,0.533,0.556,0.444,0.526,0.323,0.308,0.195,0.333,
0.323,0.256,0.345,0.205,0.286,0.706,0.7,0.6,0.571,0.364,
0.429,0.326,0.571,0.424,0.341,0.387,0.341,0.324,0.696,
0.696,0.583,0.556,0.645,0.435,0.471,0.556),year2=c(0.385,
0.552,0.645,0.516,0.629,0.595,0.72,0.638,0.557,0.588,
0.63,0.744,0.773,0.571,0.723,0.769,0.667,0.667,0.526,
0.476,0.294,0.323,0.222,0.556,0.263,0.37,0.357,0.25,
0.323,0.778,0.667,0.636,0.583,0.432,0.412,0.333,0.571,
0.39,0.4,0.452,0.326,0.471,0.7,0.75,0.615,0.462,0.556,
0.4,0.696,0.465),year3=c(0.435,0.759,0.759,0.759,0.714,
0.593,0.651,0.683,0.513,0.643,0.652,0.757,0.791,0.649,
0.78,0.5,0.5,0.5,0.533,0.429,0.333,0.286,0.231,0.533,
0.303,0.417,0.333,0.333,0.357,0.909,1,0.952,0.8,0.556,
0.529,0.562,0.762,0.513,0.733,0.611,0.733,0.647,0.909,
0.857,0.8,0.556,0.588,0.562,0.857,0.513),year4=c(0.333,
0.533,0.6,0.483,0.743,0.5,0.691,0.619,0.583,0.385,0.653,
0.762,0.844,0.64,0.667,0.571,0.571,0.615,0.421,0.5,0.205,
0.308,0.25,0.6,0.242,0.308,0.276,0.235,0.211,0.9,0.632,
0.72,0.727,0.356,0.5,0.368,0.5,0.41,0.562,0.514,0.4,
0.409,0.632,0.72,0.727,0.4,0.5,0.421,0.5,0.462)),.Names=c("year1",
"year2","year3","year4"),row.names=c(NA,-50L),class="data.frame")
colname.mat<-combn(paste0("year",1:4),2)
samplenames<-apply(colname.mat,2,paste,collapse="")
k<-10000
meandiff<-function(x) return(mean(x[[1]])-mean(x[[2]]))
for(column in 1:ncol(colname.mat)) {
assign(samplenames[column],
replicate(k,data.frame(sample(daT[,colname.mat[1,column]],3,TRUE),
sample(daT[,colname.mat[2,column]],3,TRUE))))
meandiffs<-unlist(apply(get(samplenames[column]),2,meandiff))
cat(samplenames[column],"\n")
cat("mean diff =",mean(meandiffs),"95% CI =",
quantile(meandiffs,c(0.025,0.975)),"\n")
png(paste0(samplenames[column],".png")
hist(meandiffs)
dev.off()
}
You should get a printout of the means and CIs and bunch of PNG files with
the histograms.
Jim
On Tue, Sep 11, 2018 at 11:55 PM Kristi Glover <kristi.glover using hotmail.com>
wrote:
> Dear Jim,
>
> Thank you very much for the code. I run it but it gave me row names
> like "year224", "year142".
>
> are these the difference between columns? If we want to get bootstrapping
> means of difference between years (year2-year1; year3-year1), its CI and
> exact p value, how can we get it?
>
> thanks
>
> KG
>
> ----
>
> head(daT)
>
> colname.mat<-combn(paste0("year",1:4),2)
>
> samplenames<-apply(colname.mat,2,paste,collapse="")
>
> k<-10
>
> for(column in 1:ncol(colname.mat)) {
>
> assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,
> column]]),3,TRUE)))
>
> }
>
> > get(samplenames[1])
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> year224 0.556 0.667 0.571 0.526 0.629 0.696 0.323 0.526 0.256 0.667
> year142 0.324 0.324 0.706 0.638 0.600 0.294 0.612 0.688 0.432 0.387
> year237 0.571 0.696 0.629 0.471 0.462 0.471 0.452 0.595 0.333 0.435
>
>
>
>
> ------------------------------
> *From:* Jim Lemon <drjimlemon using gmail.com>
> *Sent:* September 11, 2018 1:44 AM
> *To:* Kristi Glover
> *Cc:* r-help mailing list
> *Subject:* Re: [R] loop for comparing two or more groups using
> bootstrapping
>
> Hi Kristy,
> Try this:
>
> colname.mat<-combn(paste0("year",1:4),2)
> samplenames<-apply(colname.mat,2,paste,collapse="")
> k<-10000
> for(column in 1:ncol(colname.mat)) {
>
> assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
> }
>
> Then use get(samplenames[1]) and so on to access the values.
>
> Jim
> On Tue, Sep 11, 2018 at 4:52 PM Kristi Glover <kristi.glover using hotmail.com>
> wrote:
> >
> > Hi R users,
> >
> > I was trying to test a null hypothesis of difference between two groups
> was 0. I have many years data, such as year1, year2,, year3, year4 and I
> was trying to compare between year1 and year2, year1 and year3, year2 and
> year3 and so on and have used following code with an example data.
> >
> >
> > I tried to make a loop but did not work to compare between many years,
> and also want to obtain the exact p value. Would you mind to help me to
> make a loop?
> >
> > Thanks for your help.
> >
> >
> > KG
> >
> >
> > daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,
> > 0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,
> > 0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,
> > 0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,
> > 0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,
> > 0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,
> > 0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,
> > 0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,
> > 0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,
> > 0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,
> > 0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,
> > 0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,
> > 0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,
> > 0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,
> > 0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,
> > 0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,
> > 0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,
> > 0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,
> > 0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,
> > 0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,
> > 0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,
> > 0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names =
> c("year1",
> > "year2", "year3", "year4"), row.names = c(NA, -50L), class =
> "data.frame")
> >
> > head(daT)
> >
> > # null hypothesis; difference is equal to zero
> >
> > dif1.2<-daT$year2-daT$year1
> >
> > k=10000
> >
> > mysamples1.2=replicate(k, sample(dif1.2, replace=T))
> >
> > mymeans1.2=apply(mysamples1.2, 2, mean)
> >
> > quantile(mymeans1.2, c(0.025, 0.975))
> >
> > hist(mysamples1.2)
> >
> > mean(mymeans1.2)
> >
> > #what is p value?
> >
> >
> > #similarly Now I want to compare between year 1 and year3,
> >
> > dif1.3<-daT$year3-daT$year1
> >
> > mysamples1.3=replicate(k, sample(dif1.3, replace=T))
> >
> > mymeans1.3=apply(mysamples1.3, 2, mean)
> >
> > quantile(mymeans1.3, c(0.025, 0.975))
> >
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> R-help -- Main R Mailing List: Primary help - Homepage - SfS
> <https://stat.ethz.ch/mailman/listinfo/r-help>
> stat.ethz.ch
> The main R mailing list, for announcements about the development of R and
> the availability of new code, questions and answers about problems and
> solutions using R, enhancements and patches to the source code and
> documentation of R, comparison and compatibility with S and S-plus, and for
> the posting of nice examples and benchmarks.
>
>
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
[[alternative HTML version deleted]]
More information about the R-help
mailing list