[R] How to perform a grouped shapiro wilk test on dataframe

arun smartpink111 at yahoo.com
Fri Apr 5 23:03:19 CEST 2013


Hi,
library(plyr)
res<-ddply(dat1,.(ACTIVITY), summarise, cbind(if(length(unique(COUNTS))==1) NA else shapiro.test(COUNTS)$p.value, if(length(unique(COUNTS))==1) NA else shapiro.test(COUNTS)$statistic))
res1<- data.frame(ACTIVITY=res[,1],as.data.frame(res[,2]),stringsAsFactors=FALSE)
names(res1)[2:3]<-c("Pvalue","stats")
 res1
#   ACTIVITY       Pvalue     stats
#1 activity1           NA        NA
#2 activity2 8.025059e-11 0.4588439
#3 activity3 3.760396e-07 0.7282838
str(res1)
#'data.frame':    3 obs. of  3 variables:
# $ ACTIVITY: chr  "activity1" "activity2" "activity3"
# $ Pvalue  : num  NA 8.03e-11 3.76e-07
# $ stats   : num  NA 0.459 0.728

#or
res2<-ddply(dat1,.(ACTIVITY), summarise, value=c(if(length(unique(COUNTS))==1) NA else shapiro.test(COUNTS)$p.value, if(length(unique(COUNTS))==1) NA else shapiro.test(COUNTS)$statistic))
res2$newCol<-rep(c("Pvalue","stats"),times=nrow(res2)/2)
library(reshape2)
res3<-dcast(res2,ACTIVITY~newCol,value.var="value")
 res3
#   ACTIVITY       Pvalue     stats
#1 activity1           NA        NA
#2 activity2 8.025059e-11 0.4588439
#3 activity3 3.760396e-07 0.7282838
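
Both versions above call shapiro.test() twice per group. A minimal base R sketch that runs the test once per group and needs no extra packages is given below. The data frame is made up, since sample.csv is not shown in the thread, and the names sw_one, res_list and res4 are only illustrative. Note that shapiro.test() also fails for fewer than 3 (or more than 5000) non-missing values, so the sketch assumes each group is large enough.

```r
# Hypothetical stand-in for dat1; the real sample.csv is not available here.
set.seed(1)
dat1 <- data.frame(
  ACTIVITY = rep(c("activity1", "activity2", "activity3"), each = 20),
  COUNTS   = c(rep(5, 20), rpois(20, 3), rpois(20, 40)),
  stringsAsFactors = FALSE
)

# Run the test once per group; return NA for both values when all counts
# are identical (shapiro.test() stops with an error in that case).
sw_one <- function(x) {
  if (length(unique(x)) == 1) {
    return(c(Pvalue = NA_real_, stats = NA_real_))
  }
  tst <- shapiro.test(x)
  c(Pvalue = tst$p.value, stats = unname(tst$statistic))
}

# One named numeric vector per ACTIVITY level, then bind them row-wise.
res_list <- lapply(split(dat1$COUNTS, dat1$ACTIVITY), sw_one)
res4 <- data.frame(ACTIVITY = names(res_list),
                   do.call(rbind, res_list),
                   row.names = NULL, stringsAsFactors = FALSE)
res4
```

res4 should have the same layout as res1 and res3: one row per ACTIVITY, with NA in both columns for a group whose counts are all equal.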

A.K.





----- Original Message -----
From: "Mossadegh, Ramine N." <Ramine.Mossadegh at finra.org>
To: arun <smartpink111 at yahoo.com>
Cc: 
Sent: Friday, April 5, 2013 4:17 PM
Subject: RE: [R] How to perform a grouped shapiro wilk test on dataframe

The statistic and the p-value. When I do all2 <- as.data.frame(stats), I get lots of garbage in the 2nd column, like:

list(statistic = 0.0889037906739691, p.value = 6.41197341678277e-20, method = "Shapiro-Wilk normality test", data.name = "x")

Thanks
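
The output quoted above is what a whole test object looks like when it is forced into a data frame cell: each element returned by tapply() is either NA or a list of class "htest". A possible way around it, sketched under the assumption that stats was created with the tapply() call from earlier in the thread, is to pull one number per group out of each element first. The data and the helper get_field are hypothetical.

```r
# Hypothetical data standing in for the poster's data frame.
set.seed(42)
dat1 <- data.frame(
  ACTIVITY = rep(c("activity1", "activity2"), each = 15),
  COUNTS   = c(rep(7, 15), rpois(15, 4)),
  stringsAsFactors = FALSE
)

# Each element of the result is NA or a full "htest" object.
stats <- with(dat1, tapply(COUNTS, ACTIVITY,
  function(x) if (length(unique(x)) == 1) NA else shapiro.test(x)))

# Extract a single field from an element, or NA if no test was run.
get_field <- function(s, field) {
  if (inherits(s, "htest")) unname(s[[field]]) else NA_real_
}

# Build the data frame from plain numeric vectors, not from the test objects.
all2 <- data.frame(
  ACTIVITY  = names(stats),
  statistic = sapply(stats, get_field, field = "statistic"),
  p.value   = sapply(stats, get_field, field = "p.value"),
  row.names = NULL, stringsAsFactors = FALSE
)
all2
```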

-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com] 
Sent: Friday, April 05, 2013 4:15 PM
To: Mossadegh, Ramine N.
Subject: Re: [R] How to perform a grouped shapiro wilk test on dataframe

It depends on which results you want to put into the data frame.





----- Original Message -----
From: "Mossadegh, Ramine N." <Ramine.Mossadegh at finra.org>
To: arun <smartpink111 at yahoo.com>
Cc: 
Sent: Friday, April 5, 2013 3:12 PM
Subject: RE: [R] How to perform a grouped shapiro wilk test on dataframe

Thanks, it now works, but how can I put the results back into a data frame?

-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com] 
Sent: Friday, April 05, 2013 2:35 PM
To: Mossadegh, Ramine N.
Cc: R help
Subject: Re: [R] How to perform a grouped shapiro wilk test on dataframe

Hi,
Try this:
dat1<- read.csv("sample.csv",sep="\t",stringsAsFactors=FALSE)
 with(dat1,tapply(COUNTS,list(ACTIVITY),function(x) if (length(unique(x))==1) NA else shapiro.test(x)))
#$activity1
#[1] NA

#$activity2

#    Shapiro-Wilk normality test

#data:  x
#W = 0.4588, p-value = 8.025e-11
#

#$activity3
#
#    Shapiro-Wilk normality test
#
#data:  x
#W = 0.7283, p-value = 3.76e-07
library(plyr)
ddply(dat1,.(ACTIVITY), summarise, Pval=if(length(unique(COUNTS))==1) NA else shapiro.test(COUNTS)$p.value)
#   ACTIVITY         Pval
#1 activity1           NA
#2 activity2 8.025059e-11
#3 activity3 3.760396e-07

A.K.



----- Original Message -----
From: ramoss <ramine.mossadegh at finra.org>
To: r-help at r-project.org
Cc: 
Sent: Friday, April 5, 2013 10:50 AM
Subject: [R] How to perform a grouped shapiro wilk test on dataframe

Hello,

I was wondering whether it is possible to perform, on a data frame called 'all', a Shapiro-Wilk normality test for COUNTS grouped by the variable ACTIVITY. Could it be done using plyr? I saw an example that applies to an array but not to a data frame:

lapply(split(dataset1$Height,dataset1$Group),shapiro.test)

Any thoughts would be much appreciated.

My data frame has this shape:

dat      ACTIVITY  COUNTS
1/1/13   XXXX      43
..
..
1/31/13  XXXX      60
1/1/13   YYYY      40
..
..
1/31/13  YYYY      10
etc., going on for 3 months.



--
View this message in context: http://r.789695.n4.nabble.com/How-to-perform-a-grouped-shapiro-wilk-test-on-dataframe-tp4663438.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Confidentiality Notice:  This email, including attachments, may include non-public, proprietary, confidential or legally privileged information.  If you are not an intended recipient or an authorized agent of an intended recipient, you are hereby notified that any dissemination, distribution or copying of the information contained in or transmitted with this e-mail is unauthorized and strictly prohibited.  If you have received this email in error, please notify the sender by replying to this message and permanently delete this e-mail, its attachments, and any copies of it immediately.  You should not retain, copy or use this e-mail or any attachment for any purpose, nor disclose all or any part of the contents to any other person. Thank you



