[R] selecting significant predictors from ANOVA result

Petr PIKAL petr.pikal at precheza.cz
Thu Jan 28 10:09:31 CET 2010


Hi

I agree with Bert that what you want to do is, how to say it politely, 
not reasonable.

Whether a p value is significant depends on the number of observations. 
Let's assume that the number is the same for each p value.
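
A small illustration of that dependence, as a sketch only. It uses the 
standard t test for a Pearson correlation; the correlation of 0.2 and the 
sample sizes 20 and 200 are made-up numbers, and p_from_r is a helper name 
invented here.

```r
# p value of a Pearson correlation r observed on n cases
# (two-sided t test with n - 2 degrees of freedom)
p_from_r <- function(r, n) {
  tval <- r * sqrt((n - 2) / (1 - r^2))
  2 * pt(-abs(tval), df = n - 2)
}

r <- 0.2                       # same effect size in both cases (assumed)
p_small <- p_from_r(r, 20)     # few observations
p_large <- p_from_r(r, 200)    # many observations
c(n20 = p_small, n200 = p_large)
```

The same correlation is far from significant with 20 cases and clearly 
below 0.05 with 200.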

Then you need your p values in a suitable object, which you did not reveal 
to us. Again I will assume that it is a 75000 x 243 matrix, let's call it 
mat. Then you can select the elements smaller than some threshold.

Here is a smaller one

mat<-matrix(runif(12),4,3)
mat<-mat/5
daf<-as.data.frame(mat)
daf
            V1          V2         V3
1 0.1833271959 0.182649428 0.16363889
2 0.1160545138 0.095533401 0.09378235
3 0.1622977912 0.005841073 0.08108027
4 0.0006527514 0.064333027 0.17431492

sapply(daf, function(x) x[x<.1])
$V1
[1] 0.0006527514

$V2
[1] 0.095533401 0.005841073 0.064333027

$V3
[1] 0.09378235 0.08108027
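
The list above gives the values but not which predictor they belong to. A 
minimal sketch for that, assuming (this layout is a guess, the original 
object was not shown) that the p values sit in a matrix with predictors in 
rows, responses in columns, and dimnames set. The name pmat is made up; the 
numbers are the first three rows of the table in the original question.

```r
pmat <- matrix(c(0.00005, 0.001, 0.05, 0.36,
                 0.0001,  0.001, 0.09, 0.37,
                 0.0002,  0.005, 0.13, 0.38),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("X1", "X2", "X3"),
                               c("Var1", "Var2", "Var3", "Var4")))

# one character vector of predictor names per response;
# lapply always returns a list, even when all lengths are equal
sig <- lapply(as.data.frame(pmat), function(p) rownames(pmat)[p <= 0.05])

# the same information in long form: one row per (response, predictor) hit
idx  <- which(pmat <= 0.05, arr.ind = TRUE)
hits <- data.frame(response  = colnames(pmat)[idx[, "col"]],
                   predictor = rownames(pmat)[idx[, "row"]],
                   p         = pmat[idx])
sig
hits
```

The long form is probably easier to write to a file when there are 75000 
responses.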

But how you would control which of the significant values have real 
meaning, and what you want to do with them, is a mystery.
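
One standard way to get at least some control here (not something proposed 
in this thread) is to adjust the p values for multiple testing before 
applying the cutoff. A rough sketch with p.adjust() from the stats package; 
the Benjamini-Hochberg method, the 0.05 level and the 243 x 100 size are 
assumptions for illustration, and the data are pure noise. An adjustment 
like this does not repair the design problems Bert describes.

```r
set.seed(1)
# pure noise: 243 predictors x 100 responses of uniform "p values"
pmat <- matrix(runif(243 * 100), nrow = 243)

# about 5% of the raw values fall below 0.05 by chance alone
raw_hits <- sum(pmat <= 0.05)

# adjust all p values together, then restore the matrix shape
padj <- matrix(p.adjust(pmat, method = "BH"), nrow = nrow(pmat))
adj_hits <- sum(padj <= 0.05)

c(raw = raw_hits, adjusted = adj_hits)
```

Adjusted values are never smaller than the raw ones, so the adjusted count 
cannot exceed the raw count.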

Regards
Petr 

r-help-bounces at r-project.org wrote on 28.01.2010 09:39:29:

> Dear Sir,
>  
> Thanks for your message. My problem is in writing code. I did ANOVA for
> 75000 response variables (let's say Y) with 243 predictors (let's say
> X-matrix) one by one with a "for" loop in R. I stored the p-values of all
> predictors; however, I have a very large file because I have p-values of
> 243 predictors for all 75000 Y-variables.
> Now, I want to find some code that automatically selects only the
> significant X-predictors from the whole list. If you have ideas on that,
> it will be a great help.
> Thanks in advance
>  
> Sincerely,
> Ram
> 
> --- On Wed, 1/27/10, Bert Gunter <gunter.berton at gene.com> wrote:
> 
> 
> From: Bert Gunter <gunter.berton at gene.com>
> Subject: RE: [R] selecting significant predictors from ANOVA result
> To: "'ram basnet'" <basnetabc at yahoo.com>, "'R help'" <r-help at r-project.org>
> Date: Wednesday, January 27, 2010, 7:56 AM
> 
> 
> Ram:
> 
> You do not say how many cases (rows in your dataset) you have, but I
> suspect it may be small (a few hundred, say).
> 
> In any case, what you describe is probably just a complicated way to
> generate random numbers -- it is **highly** unlikely that any meaningful,
> replicable scientific results would result from your proposed approach.
> 
> Not surprising -- this appears to be a very difficult data analysis issue.
> It is obvious that you have only a minimal statistical background, so I
> would strongly recommend that you find a competent local statistician to
> help you with your work. Remote help from this list is wholly inadequate.
> 
> Bert Gunter
> Genentech Nonclinical Statistics
> 
> 
> 
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of ram basnet
> Sent: Wednesday, January 27, 2010 2:52 AM
> To: R help
> Subject: [R] selecting significant predictors from ANOVA result
> 
> Dear all,
> 
> I did ANOVA for many response variables (Var1, Var2, ....Var75000), and i
> got the result of p-value like below. Now, I want to select those
> predictors, which have pvalue less than or equal to 0.05 for each response
> variable. For example, X1, X2, X3, X4, X5 and X6 in case of Var1, and
> similarly, X1, X2.......X5 in case of Var2, only X1 in case of Var3 and
> none of the predictors in case of Var4.
> 
> predictors   Var1       Var2     Var3   Var4
> X1           0.00005    0.001    0.05   0.36
> X2           0.0001     0.001    0.09   0.37
> X3           0.0002     0.005    0.13   0.38
> X4           0.0003     0.01     0.17   0.39
> X5           0.01       0.05     0.21   0.4
> X6           0.05       0.0455   0.25   0.41
> X7           0.038063   0.0562   0.29   0.42
> X8           0.04605    0.0669   0.33   0.43
> X9           0.054038   0.0776   0.37   0.44
> X10          0.062025   0.0883   0.41   0.45
> 
> I have very large data sets (# of response variables = ~75,000). So, I
> need some kind of automated procedure, but I have no ideas.
> If I could get help from somebody, it would be great.
> 
> Thanks in advance.
> 
> Sincerely,
> 
> Ram Kumar Basnet,
> Ph. D student
> Wageningen University,
> The Netherlands.
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


