[R] : automated levene test and other tests for variable datasets

Joachim Audenaert Joachim.Audenaert at pcsierteelt.be
Wed Apr 15 14:25:51 CEST 2015


Hello Michael,

thank you for the reply, it realy helped me to simplify my script. 
Basically all my questions are a bit the same, but with your hint I could 
solve most of my problems. 

Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, België
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audenaert at pcsierteelt.be | W: www.pcsierteelt.be 



From:   Michael Dewey <lists at dewey.myzen.co.uk>
To:     Joachim Audenaert <Joachim.Audenaert at pcsierteelt.be>, 
r-help at r-project.org
Date:   14/04/2015 18:17
Subject:        Re: [R] : automated levene test and other tests for 
variable datasets



You ask quite a lot of questions, I have given some hints about your 
first example inline

On 14/04/2015 09:07, Joachim Audenaert wrote:
> Hello all,
>
> I am writing a script for statistical comparison of means. I'm doing 
many
> field trials with plants, where we have to compare the efficacy of
> different treatments on, different groups of plants. Therefore I would
> like to automate this script so it can be used for different datasets of
> different experiments (which will have different dimensions). An example
> dataset is given here under, I would like to compare if the data of 5
> columns (A,B,C,D,E) are statistically different from each other, where 
A,
> B, C, D and A are different treatments of my plants and I have 5
> replications for this experiment
>
> dataset <- structure(list(A = c(62, 55, 57, 103, 59), B = c(36, 24, 61,
> 19, 79), C = c(33, 97, 54, 48, 166), D = c(106, 82, 116, 85, 94), E =
> c(32, 16, 9, 7, 46)), .Names = c("A", "B", "C", "D",    "E"), row.names 
=
> c(NA, 5L), class = "data.frame")
>
> 1) First I would like to do a levene test to check the equality of
> variances of my datasets. Currently I do this as follows:
>
> library("car")
> attach(dataset)
Usually best to avoid this and use the data=parameter or with or within

> y <- c(A,B,C,D,E)
you could use unlist( ) here
> group <- as.factor(c(rep(1, length(A)), rep(2, length(B)),rep(3,
> length(C)), rep(4, length(D)),rep(5, length(E))))
you can get the lengths which you need with
lengtha <- lapply(dataset, length)
or
lengths <- sapply(dataset, length)
depending

then
rep(letters[1:length(lengths)], lengths)
should get you the group variable you want.


I have just typed all those in so there may be typos but at least you 
know where to look. I am not suggesting that I think automating all 
statistical analyses is necessarily a good idea either.

> leveneTest(y, group)
>
> Is there a way to automate this for all types of datasets, so that I can
> use the same script for a datasets with any number of columns of data to
> compare? My above script only works for a dataset with 5 columns to
> compare
>
> 2) For my boxplots I use
>
> boxplot(dataset)
>
> which gives me all the boxplots of each dataset, so this is how I want 
it
>
> 3) To check normality I currently use the kolmogorov smirnov test as
> follows
>
> ks.test(A,pnorm)
> ks.test(B,pnorm)
> ks.test(C,pnorm)
> ks.test(D,pnorm)
> ks.test(E,pnorm)
>
> Is there a way to replace the A, B, C, ... on the five lines into one 
line
> of entry so that the kolmogorov smirnov test is done on all columns of 
my
> dataset at once?
>
> 4) if data is normally distributed and the variances are equal I want to
> do a t-test and do pairwise comparison, currently like this
>
> pairwise.t.test(y,group,p.adjust.method = "none")
>
> if data is not normally distributed or variances are unequal I do a
> pairwise comparison with the wilcoxon test
>
> pairwise.wilcox.test(y,group,p.adjust.method = "none")
>
> But again I would like to make this easier, is there a way to replace 
the
> y and group in my datalineby something so it works for any size of
> dataset?
>
> 5) Once I have my paiwise comparison results I know which groups are
> statistically different from others, so I can add a and b and c to
> different groups in my graph. Currently I do this on a sheet of paper by
> comparing them one by one. Is there also a way to automate this? So R
> gives me for example something like this
>
> A: a
> B: a
> C: b
> D: ab
> E: c
>
> All help and commentys are welcome. I'm quite new to R and not a
> statistical genious, so if I'm overseeing things or thinking in a wrong
> way please let me know how I can improve my way of working. In short I
> would like to build a script that can compare the means of different
> groups of data and check if they are statistically diiferent
>
> Met vriendelijke groeten - With kind regards,
>
> Joachim Audenaert
> onderzoeker gewasbescherming - crop protection researcher
>
> PCS | proefcentrum voor sierteelt - ornamental plant research
>
> Schaessestraat 18, 9070 Destelbergen, Belgi�
> T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
> E: joachim.audenaert at pcsierteelt.be | W: www.pcsierteelt.be
>
> Heb je je individuele begeleiding bemesting (CVBB) al aangevraagd? | Het
> PCS op LinkedIn
> Disclaimer | Please consider the environment before printing. Think 
green,
> keep it on the screen!
>                [[alternative HTML version deleted]]
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Michael
http://www.dewey.myzen.co.uk/home.html



Heb je je individuele begeleiding bemesting (CVBB) al aangevraagd? | Het 
PCS op LinkedIn
Disclaimer | Please consider the environment before printing. Think green, 
keep it on the screen!

	[[alternative HTML version deleted]]



More information about the R-help mailing list