[BioC] Human Gene array data analysis workflow

Javier Pérez Florido jpflorido at gmail.com
Thu Apr 28 23:10:58 CEST 2011


Thanks Christian,
But are the other filter steps (the ones applied to 3'IVT arrays) also
correct for Gene arrays?

Thanks,
Javier


On 28/04/2011 23:05, cstrato wrote:
> Dear Javier,
>
> In principle every workflow for Exon arrays can also be applied to 
> Gene arrays.
>
> One more note:
> In principle you could use package "xps" for all these steps:
>
> - rma(.., exonlevel="core") will only use the core genes but not AFFX 
> or control genes
>
> - PreFilter(mad=c(0.5,0.01)) etc will eliminate all transcripts with 
> low variability
>
> For more details see e.g. example script "xps/examples/script4exon.R" 
> which shows you the workflow for HuExon and HuGene arrays.
>
> Best regards
> Christian
> _._._._._._._._._._._._._._._._._._
> C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
> V.i.e.n.n.a A.u.s.t.r.i.a
> e.m.a.i.l: cstrato at aon.at
> _._._._._._._._._._._._._._._._._._
>
>
> On 4/28/11 7:50 PM, Javier Pérez Florido wrote:
>> Dear list,
>> A possible data analysis workflow for EXON arrays could be as follows
>> (extracted from "Exon Array data analysis using Affymetrix Power Tools
>> and R statistical software", Briefings in Bioinformatics):
>>
>>      * Normalization and summarization (at exon or gene-level) of the
>>        array set.
>>      * Quality control of the summarized exon array data (to
>>        remove possible outliers)
>>      * Specific filtering steps, for example:
>>            o Restrict analysis to core probesets
>>            o Filter for undetected probesets (i.e., undetected exons),
>>              making use of DABG (Detected above background) analysis.
>>            o Filter for cross-hybridizing probesets (exons)
>>            o Filter for genes undetected in all groups
>>
>>    I'm running a gene-level data analysis on Human GENE 1.0 ST (not EXON)
>> arrays, which are, in principle, designed for gene expression profiling,
>> that is, a gene-level analysis. My question is related to the filtering
>> step. I was wondering if, once normalization and summarization have been
>> run at the transcript level (core), giving 33297 transcripts, the
>> following filtering can be run before differential expression analysis:
>>
>>      * Remove control transcripts such as other_spike, AFFX, pos_control
>>        (normgene->exon) and neg_control (normgene->intron). This step
>>        removes around 4156 transcripts
>>      * Remove transcripts with very low variability using the varFilter
>>        function (genefilter package)
>>
>> Since these were the steps recommended in "Bioconductor case studies"
>> book for 3'IVT arrays (the controls were different in 3'IVT), I was
>> wondering if these 2 filtering steps can also be used on Human Gene
>> arrays for gene-level analysis or, on the contrary, I have to run the
>> filtering steps described above for EXON arrays.
>> Thanks,
>> Javier
>> P.S. If you know of any data analysis workflow document for HuGene
>> arrays, please let me know.
>>
>
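
The two filtering steps discussed in this thread (dropping control transcripts, then dropping low-variability transcripts) can be sketched in a language-neutral way as follows. This is only an illustration in plain Python, not the genefilter or xps implementation. The category labels are the control types named in the original question. The function names, the toy data, and the simple order-statistic threshold are assumptions made for the example. The IQR-with-quantile-cutoff idea is meant to be loosely analogous to a varFilter-style filter, nothing more.

```python
from statistics import quantiles

# Control categories named in the thread; the exact label strings are assumed.
CONTROL_CATEGORIES = {"other_spike", "AFFX", "pos_control", "neg_control"}


def iqr(values):
    """Interquartile range of one transcript's expression values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def drop_controls(expr, category):
    """Keep only transcripts whose annotation category is not a control."""
    return {tid: vals for tid, vals in expr.items()
            if category.get(tid) not in CONTROL_CATEGORIES}


def variability_filter(expr, cutoff=0.5):
    """Keep transcripts whose IQR is above the `cutoff` quantile of all IQRs.

    The threshold is a plain order statistic of the sorted IQRs, which is a
    simplification chosen for this sketch.
    """
    if not expr:
        return {}
    spreads = {tid: iqr(vals) for tid, vals in expr.items()}
    ranked = sorted(spreads.values())
    threshold = ranked[int(cutoff * (len(ranked) - 1))]
    return {tid: vals for tid, vals in expr.items() if spreads[tid] > threshold}


# Toy, made-up expression values (one list of samples per transcript).
expr = {
    "t1": [5.0, 5.1, 5.0, 5.2],
    "t2": [4.0, 8.0, 5.0, 9.0],
    "t3": [7.0, 7.0, 7.1, 7.0],
    "t4": [3.0, 6.0, 2.0, 7.0],
    "c1": [10.0, 2.0, 12.0, 1.0],
}
category = {"t1": "main", "t2": "main", "t3": "main", "t4": "main", "c1": "AFFX"}

filtered = drop_controls(expr, category)   # step 1: remove controls
kept = variability_filter(filtered)        # step 2: remove flat transcripts
print(sorted(kept))
```

Controls are removed first so that they do not influence the variability threshold. Both steps use only the expression values and the annotation category, never the sample groups, so they can be applied before differential expression testing.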


