[R] help with randomisation test...

Thu Jul 23 18:47:38 CEST 2009

I am not sure that I fully understand what all you want to do (and I don't understand why you need the correlation and if a correlation based on 3 pairs is even meaningful), but here is a first stab at what you are trying to do:

tmp <- "Species Control_CR Damage_DR
A 10 2
A 9 3
A 7 4
A 9 2
A 8 3
B 5 4
B 6 3
B 4 2
B 5 4
B 6 3
C 8 1
C 6 2
C 4 4
C 7 1
C 6 2"

plants <- read.table( textConnection(tmp), header=TRUE )

plants2 <- rbind( data.frame( Species=plants$Species, growth=plants[[2]], group=1 ), 
	data.frame(Species=plants$Species, growth=plants[[3]], group=2 ) )

tmpfun <- function() {
	group2 <- ave( plants2$group, plants2$Species, FUN=sample )
	cr <- tapply( plants2[ group2==1, 'growth'], plants2[group2==1, 'Species'],
				FUN=mean)
	dr <- tapply( plants2[ group2==2, 'growth'], plants2[group2==2, 'Species'],
				FUN=mean)

	ir <- cr - dr
	cor( ir, cr )
}

out <- replicate( 1000, tmpfun() )

hist(out)

Is this what you wanted? Or at least helpful for going in the right direction?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Anne Kempel
> Sent: Thursday, July 23, 2009 9:02 AM
> To: r-help at r-project.org
> Subject: [R] help with randomisation test...
> 
> Dear R-people,
> I hope asking this is not too cheeky, but I do have a R Problem. I hope
> that some of you like to play around with R and can help me.
> 
> Its like this. I have several plant species (A,B,C) and 10 replicates
> per species. 5 plants per species are damaged, 5 not. I let a
> caterpillar feed on each plant and measured the growth of the
> caterpillars on control plants (CR) and on damaged plants (DR). The
> difference, DR-CR is an indicator for induced resistance IR.
> 
> The data could look like this...just as an example.
> 
> Species	Control_CR	Damage_DR
> A	10	2
> A	9	3
> A	7	4
> A	9	2
> A	8	3
> B	5	4
> B	6	3
> B	4	2
> B	5	4
> B	6	3
> C	8	1
> C	6	2
> C	4	4
> C	7	1
> C	6	2
> 
> 
> Now, i want to see if there is a trade-off (or a negative correlation)
> between CR and IR, but because IR is calculated out of CR there is a
> danger of spurious correlation. What I have to do to get rid of it is
> the following:
> 
> 1)Draw with replacement sets of 5 plants per treatment for each species
> 
> 2)Calculate of those sets the means (in the control and the damage
> treatment) and then abstract the mean of the damage minus the mean of
> the control to get a mean for IR.
> 
> 3)Randomize this IR (now called IRrandom)among all the species.
> 4)Go back to the data, calculate:  Each of the 5 CR values minus this
> IRrandom to get 5 new DR, then draw again with replacements sets of 5
> plants of each treatment for each species and calculate new means for
> the CR and the new DR. Now you have a mean CR and a new mean DR - you
> can calculate DR-CR to get a new IR (IRnew).
> 
> 5)Correlate this CR with the new IR. And store the result.
> 
> 6)Repeat step 1-5 many times. And make a histogram of the resulting
> corelation coefficients.
> 
> 
> You see, this is quite complicated. Does any of you have any idea how
> to
> start? I am very very grateful for any suggestions!
> The helper will get a parcel with swiss chocolate ;-)  !!
> 
> Best wishes from Bern,
> Anne
> 
> 
> --
> Anne Kempel
> Institute of Plant Sciences
> University of Bern
> Altenbergrain 21
> CH-3103 Bern
> kempel at ips.unibe.ch
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.