[BioC] How can I remove control probesets from the expressionset object in gene expression analysis with Affy Human Gene 1.0ST microarray

James W. MacDonald jmacdon at uw.edu
Tue May 8 18:32:43 CEST 2012


Hi Juan,

On 5/8/2012 11:14 AM, Juan Fernández Tajes wrote:
> Sorry for taking our first conversation off-list; I just forgot to click
> reply all. I have already changed the annotation slot to
> annotation(myAB1_rma) <- "hugene11sttranscriptcluster.db", as you
> suggested. However, when I execute the nsFilter command:
>
> >filt_myAB1 = nsFilter(myAB1_rma,var.func=IQR, 
> var.cutoff=0.5,feature.exclude=IDs)
>
> In the filter.log I find this:
>
> $filter.log
> $filter.log$numDupsRemoved
> [1] 2013
>
> $filter.log$numLowVar
> [1] 9968
>
> $filter.log$numRemoved.ENTREZID
> [1] 11348
>
> I can't find a filter.log$feature.exclude entry. I've also tried converting
> the IDs object to character, but that doesn't work either.

OK, but what do you get when you do

sum(as.character(IDs) %in% featureNames(filt_myAB1$eset))

If it is 0, then all of the control probesets have already been filtered out.
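
As a rough illustration of what that check does, here is a minimal sketch in base R. The ID values are invented for the example, and `kept_features` is a hypothetical stand-in for `featureNames(filt_myAB1$eset)`; nothing here touches the real data.

```r
# Hypothetical control probeset IDs, stored as integers as they would be
# when pulled from a database column (values invented for illustration).
IDs <- c(7892501L, 7892502L, 7892503L)

# Stand-in for featureNames(filt_myAB1$eset); feature names are character.
kept_features <- c("7896736", "7896738", "7896740", "7896742")

# Count how many control IDs are still present after filtering.
n_controls_left <- sum(as.character(IDs) %in% kept_features)
n_controls_left

# If the count were non-zero, the survivors could be dropped by name.
# 'expr' is a toy matrix standing in for the expression values.
expr <- matrix(rnorm(8), nrow = 4,
               dimnames = list(kept_features, c("s1", "s2")))
expr_clean <- expr[!rownames(expr) %in% as.character(IDs), , drop = FALSE]
```

With a real ExpressionSet the same logical index would presumably be built from featureNames() rather than rownames(), but that is an assumption on my part and has not been run against the data in this thread.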

Best,

Jim


>
> Thanks a lot
>
> Juan
>
> ---------------------------------------------------------------
> Juan Fernandez Tajes, ph. D
> Grupo XENOMAR
> Departamento de Biología Celular y Molecular
> Facultad de Ciencias-Universidade da Coruña
> Tlf. +34 981 167000 ext 2030
> e-mail: jfernandezt at udc.es
> ----------------------------------------------------------------
>
>
> ------------------------------------------------------------------------
> *From: *"James W. MacDonald" <jmacdon at uw.edu>
> *To: *"Juan Fernández Tajes" <jfernandezt at udc.es>
> *CC: *Bioconductor at r-project.org
> *Sent: *Tuesday, 8 May 2012 17:04:49
> *Subject: *Re: [BioC] How can I remove control probesets from the
> expressionset object in gene expression analysis with Affy Human Gene
> 1.0ST microarray
>
> Hi Juan,
>
> Please don't take conversations off-list. We like to think of the
> archives as a resource, and if conversations are taken off-list, that
> purpose is subverted.
>
> On 5/8/2012 10:30 AM, Juan Fernández Tajes wrote:
> > Dear James,
> >
> > Thanks for your quick answer. Here is the code that I've used and the
> > sessionInfo:
> >
> > >library(pd.hugene.1.1.st.v1)
> > >library(genefilter)
> > >library(oligo)
> > >library(limma)
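> > ># 'con' was not defined in the posted code; db() from oligoClasses is assumed here
> > >con <- db(pd.hugene.1.1.st.v1)
> > >tab <- dbGetQuery(con, "select * from featureSet;")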
>
> Note that you could do this more elegantly:
>
> probes.control <- dbGetQuery(con, "select fsetid from featureSet where
> type in ('2','4','6','7');")[,1]
>
> > >probes.control <- subset(tab, tab$type=="2" | tab$type=="4" |
> > tab$type=="6" | tab$type=="7")
> > >IDs <- probes.control$fsetid
> > >geneCELs <- list.celfiles("./CEL", full.names=TRUE)
> > >affyGeneFS <- read.celfiles(geneCELs)
> > >myAB <- affyGeneFS
> > >sampleNames(myAB) <- sub("\\.CEL$", "", sampleNames(myAB))
> > >metadata_array <- read.delim(file="metadata_array_oligo.txt",
> > > header=TRUE, sep="\t")
> > >rownames(metadata_array) <- metadata_array$Sample_ID
> > >phenoData(myAB) <- new("AnnotatedDataFrame", data=metadata_array)
> > >myAB1 <- myAB[, -10]
> > >myAB1_rma <- rma(myAB1, target="core")
> > >filt_myAB1 = nsFilter(myAB1_rma,var.func=IQR,
> > var.cutoff=0.5,feature.exclude=IDs)$eset
> >
> > And here is the error that I find when trying to filter:
> > >filt_myAB1 = nsFilter(myAB1_rma,var.func=IQR,
> > var.cutoff=0.5,feature.exclude=IDs)$eset
> > Error in get(mapName, envir = pkgEnv, inherits = FALSE) :
> >   object 'pd.hugene.1.1.st.v1_dbconn' not found
>
> That is because nsFilter expects a different annotation package. So you
> need to change the annotation slot of the ExpressionSet returned by rma():
>
> annotation(myAB1_rma) <- "hugene11sttranscriptcluster.db"
>
> And you will likely need to do
>
> library(BiocInstaller)
> biocLite("hugene11sttranscriptcluster.db")
>
> first.
>
> Best,
>
> Jim
>
>
> >
> > And here is the sessionInfo()
> >
> > R version 2.14.0 (2011-10-31)
> > Platform: i386-apple-darwin9.8.0/i386 (32-bit)
> >
> > locale:
> > [1] es_ES.UTF-8/es_ES.UTF-8/es_ES.UTF-8/C/es_ES.UTF-8/es_ES.UTF-8
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> >  [1] limma_3.10.3 genefilter_1.36.0 pd.hugene.1.1.st.v1_3.4.0 oligo_1.18.1 oligoClasses_1.16.0
> >  [6] Biobase_2.14.0 RSQLite_0.11.1 DBI_0.2-5
> >
> > loaded via a namespace (and not attached):
> >  [1] affxparser_1.26.4 affyio_1.22.0 annotate_1.32.3 AnnotationDbi_1.16.19 Biostrings_2.22.0 bit_1.1-8
> >  [7] ff_2.2-6 IRanges_1.12.6 preprocessCore_1.16.0 splines_2.14.0 survival_2.36-14 tools_2.14.0
> > [13] xtable_1.7-0 zlibbioc_1.0.1
> >
> > BW,
> >
> > Juan
> >
> >
> >
> >
> > ------------------------------------------------------------------------
> > *From: *"James W. MacDonald" <jmacdon at uw.edu>
> > *To: *"Juan Fernández Tajes" <jfernandezt at udc.es>
> > *CC: *bioconductor at r-project.org
> > *Sent: *Tuesday, 8 May 2012 15:25:41
> > *Subject: *Re: [BioC] How can I remove control probesets from the
> > expressionset object in gene expression analysis with Affy Human Gene
> > 1.0ST microarray
> >
> > Hi Juan,
> >
> > On 5/8/2012 6:46 AM, Juan Fernández Tajes wrote:
> > > Dear Bioconductor subscribers:
> > >
> > > First of all, I apologize for reviving an old, resolved Bioconductor
> > > thread: https://stat.ethz.ch/pipermail/bioconductor/2011-June/039993.html
> > >
> > > "Dear list,
> > >
> > > I am quite new to R as well as to microarray analysis. I am dealing
> > > with some gene expression analysis performed on Affymetrix Human
> > > Gene 1.0ST microarray.
> > >
> > > So far, I have learnt how to filtrate data using genefilter using
> > > nsFilter functions.
> > >
> > > Now, I would like to know how to filter out from the expressionset
> > > object all the control probesets (~4000) that Affymetrix includes in
> > > the microarray (for quality assay, normalization, background
> > > correction, etc.). However, none of the aforementioned functions
> > > worked for me.
> > >
> > > How can I recognize those probesets and remove them? I would like to
> > > filter them out before statistical analysis with limma package."
> > >
> > > I have been having the same problem when using oligo to analyze the
> > > data: when I try to filter out control probeset IDs with nsFilter, it
> > > doesn't work properly. Do you know any way to get around this
> > > problem?
> >
> > Probably. But you don't give us anything to go on. What code did you
> > use? What happened? Define 'doesn't work properly'. What is the output
> > from sessionInfo()?
> >
> > Best,
> >
> > Jim
> >
> >
> > >
> > > Many thanks in advance
> > >
> > > Juan
> > >
> > >
> > >
> > >
> >
> >
>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


