[R] t.test in a loop

Thu Jan 29 08:17:48 CET 2009

Hi 

r-help-bounces at r-project.org wrote on 28.01.2009 12:57:55:

> On Wed, 28 Jan 2009, Michael Pearmain wrote:
> 
> > Hi All,
> > I've been having a little trouble creating a loop that will run a
> > series of t.tests for inspection.
> > Below is the code I've tried, and some checks I've looked at.
> >
> > I've used the get(paste()) idea as I was told previously that the use of
> > eval should be avoided.
> >
> > I've run a single call to check that my syntax is correct, and it works
> > without any problems:
> >> t.test(channel.data.train$News~channel.data.train$power)
> >
> > Can anyone offer any advice?
> 
> There's the additional problem that if your code worked it would do 16
> t-tests but only report the last one.
> 
> Assuming you want them printed
> 
> for(v in names(channel.data.train)[1:16]) {
>    print(v)
>    print(t.test(channel.data.train[[v]]~channel.data.train$power))
> }
> 
> or
> for(v in names(channel.data.train)[1:16]){
>    test <- bquote(t.test(.(as.name(v))~power, data=channel.data.train))
>    print(eval(test))
> }
> 
> This sort of use of eval is fairly harmless.
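
The original attempt fails because get() looks up an object by its name; it
does not parse or evaluate an expression, so the whole pasted string is
treated as one (nonexistent) variable name. If the formula has to be built
from a string, as.formula() can do that without get(). A minimal sketch on
made-up data (the data frame `dat` and its columns are hypothetical, not the
original channel.data.train):

```r
# Simulated stand-in data: two numeric columns and a 0/1 grouping variable
set.seed(2)
dat <- data.frame(News  = rnorm(40),
                  Sport = rnorm(40),
                  power = rep(0:1, times = 20))

for (v in c("News", "Sport")) {
  # Build the formula from the column name, e.g. News ~ power
  f <- as.formula(paste(v, "~ power"))
  print(v)
  # Variables in the formula are looked up in 'dat'
  print(t.test(f, data = dat))
}
```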

Another option is to use lapply

lapply(channel.data.train[, 1:16], function(x) 
t.test(x~channel.data.train$power))
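
For inspection it may be handier to store the results and pull out only the
numbers of interest. A rough sketch on simulated data (again, `dat` and its
columns are invented for illustration only):

```r
# Simulated stand-in data: two numeric columns and a 0/1 grouping variable
set.seed(1)
dat <- data.frame(News  = rnorm(100),
                  Games = rnorm(100),
                  power = rep(0:1, each = 50))

# One t.test per column; lapply returns a named list of "htest" objects
tests <- lapply(dat[, c("News", "Games")],
                function(x) t.test(x ~ dat$power))

# Collect the p-value of each test in a named numeric vector
pvals <- sapply(tests, function(res) res$p.value)
print(pvals)
```

Other components of each result (for example res$statistic or res$conf.int)
can be extracted the same way.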

Regards
Petr


> 
>         -thomas
> > Many thanks
> >
> > Mike
> >
> >> str(channel.data.train$power)
> > num [1:9913] 0 0 0 0 0 0 0 0 0 0 ...
> >> summary(channel.data.train$power)
> >   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> > 0.0000  0.0000  0.0000  0.2368  0.0000  1.0000
> >> names(channel.data.train)
> > [1] "News"              "Entertainment"     "Communicate"
> > [4] "Lifestyle"         "Games"             "Music"
> > [7] "Money"             "Celebrity"         "Shopping"
> > [10] "Sport"             "Film"              "Travel"
> > [13] "Cars"              "Property"          "Chat"
> > [16] "Bet.Play.Win"      "config"            "exposed"
> > [19] "site"              "referrer"          "started"
> > [22] "last_viewed"       "num_views"         "secs_since_viewed"
> > [25] "register"          "secs.na"           "power"
> > [28] "tt"
> >> for(i in names(channel.data.train[,c(1:16)])){
> > + t.test(get(paste("channel.data.train$",i,"~channel.data.train$power",sep="")))
> > + }
> > Error in get(paste("channel.data.train$", i, "~channel.data.train$power",  :
> >  variable "channel.data.train$News~channel.data.train$power" was not found
> >
> >
> >
> > --
> > Michael Pearmain
> > Senior Analytics Research Specialist
> >
> >
> > Google UK Ltd
> > Belgrave House
> > 76 Buckingham Palace Road
> > London SW1W 9TQ
> > United Kingdom
> > t +44 (0) 2032191684
> > mpearmain at google.com
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> Thomas Lumley         Assoc. Professor, Biostatistics
> tlumley at u.washington.edu   University of Washington, Seattle
> 