[BioC] Problem with removing duplicated probes of datasets without annotation

Sun Jun 22 10:02:08 CEST 2014

Dear R helpers,

I'm working with a goat dataset for which no annotation db is available. For this reason, I use the 'genefilter' function with an ANOVA filter (p<0.05) instead of 'nsFilter' (available in the 'genefilter' package). The problem is that the filtered data still contain 500 duplicated probes, which I want to remove.

Due to my limited ability, I cannot figure out how to do this. It would be great if I could keep one probe from each set of duplicates, either the one with the lowest p-value or the one with the highest variance.
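
To make the request concrete, here is a minimal base R sketch of the selection on toy data. It is only an illustration under assumptions: the matrix `expr`, the probe-to-gene vector `gene_id`, and all probe and gene names are hypothetical stand-ins, and it assumes that "duplicated" means several probes mapping to the same gene identifier. It ranks probes by variance; ranking by p-value instead would mean ordering on the p-values in increasing order.

```r
# Toy stand-in for the filtered expression matrix
# (probes in rows, samples in columns). Hypothetical data.
expr <- rbind(
  p1 = c(1.0, 1.2, 0.9, 1.1),
  p2 = c(2.0, 5.0, 1.0, 6.0),
  p3 = c(3.0, 3.1, 2.9, 3.0),
  p4 = c(0.5, 4.5, 0.2, 4.9),
  p5 = c(7.0, 7.5, 6.5, 7.2)
)

# Hypothetical probe-to-gene mapping, one entry per row of expr.
gene_id <- c("geneA", "geneA", "geneB", "geneB", "geneC")

# Score every probe by its variance across samples.
score <- apply(expr, 1, var)

# Sort probes from highest to lowest score, then keep the first
# occurrence of each gene, i.e. its highest-variance probe.
ord  <- order(score, decreasing = TRUE)
keep <- ord[!duplicated(gene_id[ord])]

# Restore the original row order in the de-duplicated matrix.
expr_unique <- expr[sort(keep), , drop = FALSE]
```

The order-then-`duplicated()` idiom avoids an explicit loop over genes, and `drop = FALSE` keeps the result a matrix even if only one probe survives.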

Would you please help me with some examples? 

Best Regards,
Kaj

 -- output of sessionInfo(): 

R version 3.1.0 (2014-04-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] Biobase_2.24.0      BiocGenerics_0.10.0 genefilter_1.46.1  

loaded via a namespace (and not attached):
 [1] annotate_1.42.0      AnnotationDbi_1.26.0 DBI_0.2-7           
 [4] GenomeInfoDb_1.0.2   IRanges_1.22.9       RSQLite_0.11.4      
 [7] splines_3.1.0        stats4_3.1.0         survival_2.37-7     
[10] tcltk_3.1.0          tools_3.1.0          XML_3.98-1.1        
[13] xtable_1.7-3

--
Sent via the guest posting facility at bioconductor.org.