[R] how to...

Tyler Smith tyler.smith at mail.mcgill.ca
Sat Mar 17 14:05:45 CET 2007

On 2007-03-16, casot at libero.it <casot at libero.it> wrote:
> for example:
> I have got these data, organized in a dataframe. 
> 		sample1	sample2	sample3	sample4	group
> replicate1	1.00	0.02	0.35	0.50	A
> replicate2	1.00	0.02	1.54	1.11	A
> replicate3	1.00	0.02	1.54	1.11	A
> replicate4	1.00	0.02	1.54	1.11	A
> replicate5	1.00	0.10	0.18	0.72	B
> replicate6	1000.00	0.75	0.86	7.26	B
> replicate7	1000.00	0.75	0.18	0.36	B
> replicate8	1000.00	0.75	12.09	0.74	B
> replicate9	1000.00	0.75	12.09	0.84	C
> replicate10	1000.00	0.98	0.65	0.50	C
> replicate11	2.00	6.00	6.00	2.00	C
> replicate12	6.00	6.00	2.00	6.00	C
> Using "aov()" I can run a test on each column, but I would
> like to run the ANOVAs for each column (of which I have hundreds)
> in an automated way. 
> sample1 ok
> sample2 ok
> sample3 not significant
> ....

	sapply(sample.df[, 1:4],  # the four sample columns (assumed to come first)
	       FUN=function(x) summary.aov(aov(x~sample.df$group))[[1]][1,"Pr(>F)"])

sapply applies a function to each 'column' of a dataframe, returning
the result as a vector.

FUN=function(x) ... is an anonymous function. As sapply loops over the
columns, each column in turn is passed in as x and used as the response
in the aov() call.

summary.aov(...) produces a list of tables, although in this case the
list is only one table long. [[1]][1,"Pr(>F)"] extracts the p-value
from the first row of the first table.
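
To see what that indexing picks out, here is a minimal sketch for a
single column, using the sample3 values from your table. The names x,
group, fit and tab are only illustrative, and I use the generic
summary() here, which dispatches to summary.aov for an aov fit.

```r
# sample3 and the grouping factor, copied from the quoted table
x <- c(0.35, 1.54, 1.54, 1.54, 0.18, 0.86, 0.18, 12.09, 12.09, 0.65, 6.00, 2.00)
group <- factor(rep(c("A", "B", "C"), each = 4))

fit <- aov(x ~ group)

# summary() on a single-stratum aov fit returns a list holding one ANOVA table
tab <- summary(fit)[[1]]
print(tab)

# Row 1 is the 'group' term, row 2 is 'Residuals'; only row 1 has a p-value
p <- tab[1, "Pr(>F)"]
print(p)
```

Printing tab first makes it easy to check that the row and column you
index really are the ones you want before wrapping it all in sapply.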

The result for your example is:


   sample1    sample2    sample3    sample4
0.09961436 0.04405756 0.49289026 0.67389417
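
If you want a listing closer to the "ok" / "not significant" one in your
message, something along these lines might do. It is only a sketch: the
data frame is rebuilt by hand from your table, the 0.05 cutoff is my
assumption, and is.sample, p.values and verdict are made-up names.

```r
# Rebuild the example data frame from the quoted table
sample.df <- data.frame(
  sample1 = c(1, 1, 1, 1, 1, 1000, 1000, 1000, 1000, 1000, 2, 6),
  sample2 = c(0.02, 0.02, 0.02, 0.02, 0.10, 0.75, 0.75, 0.75, 0.75, 0.98, 6, 6),
  sample3 = c(0.35, 1.54, 1.54, 1.54, 0.18, 0.86, 0.18, 12.09, 12.09, 0.65, 6, 2),
  sample4 = c(0.50, 1.11, 1.11, 1.11, 0.72, 7.26, 0.36, 0.74, 0.84, 0.50, 2, 6),
  group   = factor(rep(c("A", "B", "C"), each = 4)),
  row.names = paste0("replicate", 1:12)
)

# Select every column except the grouping factor, however many there are
is.sample <- names(sample.df) != "group"

# One-way ANOVA per column, keeping only the p-value of the group term
p.values <- sapply(sample.df[is.sample],
                   FUN = function(x) summary(aov(x ~ sample.df$group))[[1]][1, "Pr(>F)"])

# Label each column using an assumed cutoff of 0.05
verdict <- ifelse(p.values < 0.05, "ok", "not significant")
print(data.frame(p.value = p.values, verdict = verdict))
```

Selecting columns by name rather than by position means the same call
should keep working when there are hundreds of sample columns, as long
as the grouping column is still called group.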


Tyler Smith
