[BioC] Non-specific filtering of Affymetrix Microarray data

Wolfgang Huber whuber at embl.de
Wed Feb 19 16:54:04 CET 2014


Hi Vinay

A look at the man page of ‘nsFilter’ indicates that its output is a list, one of whose elements is ‘eset’, the filtered ExpressionSet.
You could try (I haven’t checked) with 
  selected<-genefilter(celfiles_filtered$eset, ff)
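
To illustrate why the list itself triggers the reported error, here is a minimal base-R sketch. It does not use genefilter or nsFilter: a plain matrix stands in for the expression data, a plain list stands in for the nsFilter result, and the names expr, res and keep_fun are made up for the illustration. The only thing taken from the thread is that the failing call is apply(expr, 1, flist), as shown in the error message.

```r
# Toy stand-ins (hypothetical): a matrix plays the role of the expression
# data, and a plain list plays the role of the value returned by nsFilter.
set.seed(1)
expr <- matrix(rnorm(40, mean = 7), nrow = 8,
               dimnames = list(paste0("probe", 1:8), paste0("s", 1:5)))
res <- list(eset = expr, filter.log = list())

# A row-wise filter function, applied per probe as in apply(expr, 1, flist).
keep_fun <- function(x) IQR(x) > 0.5

# Passing the whole list fails, because a list has no dim attribute.
msg <- tryCatch(apply(res, 1, keep_fun),
                error = function(e) conditionMessage(e))
print(msg)

# Passing the list element that holds the data works.
selected <- apply(res$eset, 1, keep_fun)
print(selected)
```

The same reasoning suggests that the ‘eset’ element, not the whole list, is what should be handed to genefilter.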

But I also wonder why you would want to do this?
Did you explore the ‘var.cutoff’ and ‘filterByQuantile’ arguments of ‘nsFilter’?
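
As a rough illustration of what these two arguments control, the base-R sketch below mimics the two cutoff modes on a toy matrix. It is not nsFilter's implementation; it assumes, following my reading of the man page, that features whose IQR falls below the cutoff are removed, with the cutoff being either the var.cutoff quantile of all IQR values (filterByQuantile=TRUE) or var.cutoff itself (filterByQuantile=FALSE). All object names are hypothetical.

```r
# Toy log2-scale data (hypothetical): 20 probes, 10 samples.
set.seed(1)
expr <- matrix(rnorm(200, mean = 7), nrow = 20,
               dimnames = list(paste0("probe", 1:20), paste0("s", 1:10)))

# Per-probe variability measure, here the interquartile range.
iqr <- apply(expr, 1, IQR)
var_cutoff <- 0.5

# Absolute mode (assumed analogue of filterByQuantile = FALSE):
# keep probes whose IQR is at least var_cutoff.
keep_absolute <- iqr >= var_cutoff

# Quantile mode (assumed analogue of filterByQuantile = TRUE):
# keep probes whose IQR is at least the var_cutoff quantile of all IQRs.
keep_quantile <- iqr >= quantile(iqr, probs = var_cutoff)

print(sum(keep_absolute))
print(sum(keep_quantile))
```

With var_cutoff = 0.5 the absolute mode corresponds to an "IQR of at least 0.5" rule, whereas the quantile mode keeps the more variable half of the probes regardless of their absolute IQR.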

	Wolfgang
	


On 18 Feb 2014, at 05:07, Vinay Randhawa [guest] <guest at bioconductor.org> wrote:

> 
> During non-specific filtering, I am using parameters for filtering probes (require.entrez=TRUE, remove.dupEntrez=TRUE, feature.exclude="^AFFX") in addition to the intensity and variance filters. Independently, both filters work fine, but when I try to use them together, I get the error below:
> Error in apply(expr, 1, flist) : dim(X) must have a positive length
> 
> 
> Please help me with this.
> 
> 
> I have pasted the code below.
> 
> #1.Getting the data
> source("http://bioconductor.org/biocLite.R")
> biocLite("GEOquery")
> biocLite("affycoretools")
> library(GEOquery)
> setwd("/home/vinay/R/R-3.0.2")
> getGEOSuppFiles("GSE6631")
> setwd("/home/vinay/R/R-3.0.2/GSE6631")
> 
> system("tar -xvf GSE6631_RAW.tar")
> cels <- list.files(pattern = "\\.gz$")
> sapply(cels, gunzip)
> 
> #2.Loading and normalising the data using GC-RMA
> # You may need to copy your phenodata.txt file into the GSE6631 folder 
> library(affy)
> library(affycoretools)
> data <- ReadAffy()
> pData(data)<-read.table("phenodata.txt", header=T,row.names=1, sep="\t")
> pData(data)
> eset <- gcrma(data)
> eset
> dim(eset)
> pData(eset)
> write.exprs(eset, file="Expression_values_GCRMA_normalize.xls")
> eset2<-eset[,pData(eset)[,"Condition"]%in%c("Normal","Cancer")] 
> 
> 
> #3. Non-specific Filtering data
> library(genefilter)
> celfiles_filtered <- nsFilter(eset2, require.entrez=TRUE, remove.dupEntrez=TRUE,feature.exclude="^AFFX")
> f1<-pOverA(0.10,log2(100))  #intensity filter: the intensity of a gene should be above log2(100) in at least 10 percent of the samples
> f2<-function(x)(IQR(x)>0.5)  #variance filter: the interquartile range of the log2 intensities should be greater than 0.5
> ff<-filterfun(f1,f2)
> selected<-genefilter(celfiles_filtered,ff)
> 
> 
> 
> 
> 
> 
> -- output of sessionInfo(): 
> 
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-unknown-linux-gnu (64-bit)
> 
> locale:
> [1] LC_CTYPE=en_IN       LC_NUMERIC=C         LC_TIME=en_IN       
> [4] LC_COLLATE=en_IN     LC_MONETARY=en_IN    LC_MESSAGES=en_IN   
> [7] LC_PAPER=en_IN       LC_NAME=C            LC_ADDRESS=C        
> [10] LC_TELEPHONE=C       LC_MEASUREMENT=en_IN LC_IDENTIFICATION=C 
> 
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods  
> [8] base     
> 
> other attached packages:
> [1] hgu95av2.db_2.10.1         org.Hs.eg.db_2.10.1       
> [3] arrayQualityMetrics_3.18.0 affyPLM_1.38.0            
> [5] preprocessCore_1.24.0      RColorBrewer_1.0-5        
> [7] hgu95av2probe_2.13.0       affycoretools_1.34.0      
> [9] KEGG.db_2.10.1             GO.db_2.10.1              
> [11] RSQLite_0.11.4             DBI_0.2-7                 
> [13] limma_3.18.12              hgu95av2cdf_2.13.0        
> [15] AnnotationDbi_1.24.0       simpleaffy_2.38.0         
> [17] genefilter_1.44.0          gcrma_2.34.0              
> [19] affy_1.40.0                GEOquery_2.28.0           
> [21] Biobase_2.22.0             BiocGenerics_0.8.0        
> [23] BiocInstaller_1.12.0      
> 
> loaded via a namespace (and not attached):
> [1] affyio_1.30.0            annaffy_1.34.0           annotate_1.40.0         
> [4] AnnotationForge_1.4.4    beadarray_2.12.0         BeadDataPackR_1.14.0    
> [7] biomaRt_2.18.0           Biostrings_2.30.1        biovizBase_1.10.7       
> [10] bit_1.1-11               bitops_1.0-6             BSgenome_1.30.0         
> [13] Cairo_1.5-5              Category_2.28.0          caTools_1.16            
> [16] cluster_1.14.4           codetools_0.2-8          colorspace_1.2-4        
> [19] DESeq2_1.2.10            dichromat_2.0-0          digest_0.6.4            
> [22] edgeR_3.4.2              ff_2.2-12                foreach_1.4.1           
> [25] Formula_1.1-1            gdata_2.13.2             GenomicFeatures_1.14.2  
> [28] GenomicRanges_1.14.4     ggbio_1.10.11            ggplot2_0.9.3.1         
> [31] GOstats_2.28.0           gplots_2.12.1            graph_1.40.1            
> [34] grid_3.0.2               gridExtra_0.9.1          GSEABase_1.24.0         
> [37] gtable_0.1.2             gtools_3.3.0             Hmisc_3.14-0            
> [40] hwriter_1.3              IRanges_1.20.6           iterators_1.0.6         
> [43] KernSmooth_2.23-10       labeling_0.2             lattice_0.20-24         
> [46] latticeExtra_0.6-26      locfit_1.5-9.1           MASS_7.3-29             
> [49] Matrix_1.1-2             munsell_0.4.2            oligoClasses_1.24.0     
> [52] PFAM.db_2.10.1           plyr_1.8                 proto_0.3-10            
> [55] R2HTML_2.2.1             RBGL_1.38.0              Rcpp_0.11.0             
> [58] RcppArmadillo_0.4.000.2  RCurl_1.95-4.1           ReportingTools_2.2.0    
> [61] reshape2_1.2.2           R.methodsS3_1.6.1        R.oo_1.17.0             
> [64] Rsamtools_1.14.3         rtracklayer_1.22.3       R.utils_1.29.8          
> [67] scales_0.2.3             setRNG_2011.11-2         splines_3.0.2           
> [70] stats4_3.0.2             stringr_0.6.2            survival_2.37-7         
> [73] SVGAnnotation_0.93-1     tcltk_3.0.2              tools_3.0.2             
> [76] VariantAnnotation_1.8.12 vsn_3.30.0               XML_3.98-1.1            
> [79] xtable_1.7-1             XVector_0.2.0            zlibbioc_1.8.0          
> 
> 
> 
> --
> Sent via the guest posting facility at bioconductor.org.
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


