[BioC] Difference number of probesets after RMA with Oligo package and APT

Paul Geeleher paulgeeleher at gmail.com
Fri Oct 26 11:21:58 CEST 2012


I'm not sure how you calculated the number of probesets, but it is
worth double-checking the "extended" and "full" counts you list for
APT: there should be more probesets in "full" than in "extended".
See here:
http://www.affymetrix.com/support/help/exon_glossary/index.affx

That is, "full" includes all the "extended" probesets and also adds
some that are supported only by computational predictions.
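
A quick way to check this ordering is to compare the counts level by
level. The following is only a minimal sketch in base R, using the
counts quoted below; the names `counts` and `is_nested` are
illustrative and not part of oligo or APT.

```r
# Probeset counts as reported in the quoted message,
# ordered from the smallest annotation level to the largest.
counts <- data.frame(
  level = c("core", "extended", "full"),
  apt   = c(287329, 807038, 1384231),
  oligo = c(22011, 133672, 266405)
)

# The annotation levels are nested (core within extended within full),
# so each column should be strictly increasing in this order.
is_nested <- function(x) all(diff(x) > 0)

# One TRUE/FALSE per tool: does the reported ordering hold?
nested <- sapply(counts[, c("apt", "oligo")], is_nested)
print(nested)
```

If either entry comes back FALSE, the counts for two levels were
probably swapped or computed from the wrong output file.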

Paul.

On Fri, Oct 26, 2012 at 8:39 AM, Sophie Lamarre
<sophie.lamarre at insa-toulouse.fr> wrote:
> Hello,
>
> I work with Affymetrix Human Exon 1.0 ST arrays.
>
> I am trying different methods to analyze this type of microarray. So
> far, I have tried APT and the oligo package:
>   - I ran RMA in APT with this command:
> apt-probeset-summarize -a rma -p HuEx-1_0-st-v2.r2.pgf -c
> HuEx-1_0-st-v2.r2.clf -s HuEx-1_0-st-v2.r2.dt1.hg18.core.ps
> --qc-probesets HuEx-1_0-st-v2.r2.qcc -o OUT_EXON_CORE *.CEL
>   - I ran RMA in the oligo package with this code:
> library("oligo")
> library("pd.huex.1.0.st.v2")
> celFiles <- c("Data GSE24976/GSM613529.CEL",
>                "Data GSE24976/GSM613530.CEL",
>                "Data GSE24976/GSM613531.CEL",
>                "Data GSE24976/GSM613532.CEL",
>                "Data GSE24976/GSM613533.CEL",
>                "Data GSE24976/GSM613534.CEL",
>                "Data GSE24976/GSM613535.CEL",
>                "Data GSE24976/GSM613536.CEL")
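> # data_base is never defined in the original message; presumably the
> # CEL files were read with oligo's read.celfiles() as below
> data_base <- read.celfiles(celFiles)
> core_rma = rma(data_base, target = "core")
> data_base_rma = exprs(core_rma)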
>
> When I compare the number of probesets in the file normalized with
> the oligo package and in the file normalized with APT, there is a big
> difference:
> - At CORE level:
>    * APT: 287 329 probesets
>    * Oligo package: 22 011 probesets
> - At FULL level:
>    * APT: 1 384 231 probesets
>    * Oligo package: 266 405 probesets
> - At EXTENDED level:
>    * APT: 807 038 probesets
>    * Oligo package: 133 672 probesets
>
> Why is there such a difference? Could it be explained by incomplete
> information in the pd.huex.1.0.st.v2 package?
>
> Is the oligo package really recommended for analyzing Affymetrix Exon
> Arrays? Which package can I use to analyze Affymetrix Exon Arrays at
> the CORE/FULL/EXTENDED level?
>
> Thank you in advance for your answer and your help.
>
>
> ------------------------------
>
> My Session Info:
>
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> LC_TIME=en_US.UTF-8
>   [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
> LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C                  LC_ADDRESS=C
> [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8
> LC_IDENTIFICATION=C
>
> attached base packages:
> [1] grid      stats     graphics  grDevices utils     datasets
> methods   base
>
> other attached packages:
>   [1] pd.huex.1.0.st.v2_3.8.0 RSQLite_0.11.2          DBI_0.2-5
>   [4] oligo_1.22.0            oligoClasses_1.20.0     affy_1.36.0
>   [7] gplots_2.11.0           MASS_7.3-18             KernSmooth_2.23-7
> [10] caTools_1.13            bitops_1.0-4.1          gdata_2.12.0
> [13] gtools_2.7.0            geneplotter_1.36.0      lattice_0.20-6
> [16] annotate_1.36.0         AnnotationDbi_1.20.2    Biobase_2.18.0
> [19] BiocGenerics_0.4.0      limma_3.14.1
>
> loaded via a namespace (and not attached):
>   [1] affxparser_1.30.0     affyio_1.26.0         BiocInstaller_1.8.3
>   [4] Biostrings_2.26.2     bit_1.1-9             codetools_0.2-8
>   [7] ff_2.2-9              foreach_1.4.0         GenomicRanges_1.10.2
> [10] IRanges_1.16.3        iterators_1.0.6       parallel_2.15.1
> [13] preprocessCore_1.20.0 RColorBrewer_1.0-5    splines_2.15.1
> [16] stats4_2.15.1         tools_2.15.1          XML_3.95-0.1
> [19] xtable_1.7-0          zlibbioc_1.4.0
>
> --
>
> Sophie LAMARRE
>
> Statistician - FRANCE
>
>



-- 
Dr. Paul Geeleher
School of Mathematics, Statistics and Applied Mathematics
National University of Ireland
Galway
Ireland
--
www.bioinformaticstutorials.com


