[BioC] Human Gene array data analysis workflow

cstrato cstrato at aon.at
Thu Apr 28 23:23:41 CEST 2011


At the moment I do not know which steps you mean, but in principle, yes: 
After preprocessing you have a dataframe of gene expression levels. It 
does not matter which chip was used.
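
A minimal sketch of that point, assuming the preprocessed data is a plain
matrix of expression levels (rows = transcripts, columns = samples) plus one
category label per transcript. The object names, the simulated data and the
category labels used here are assumptions for illustration only; the
IQR-based cutoff is a base-R stand-in for a variability filter, not the
varFilter or PreFilter implementation itself.

```r
# Hypothetical summarized expression matrix (simulated, not real array data):
# rows = transcripts, columns = samples.
set.seed(1)
n_transcripts <- 1000
n_samples <- 6
expr <- matrix(rnorm(n_transcripts * n_samples, mean = 7, sd = 1),
               nrow = n_transcripts,
               dimnames = list(paste0("t", seq_len(n_transcripts)),
                               paste0("s", seq_len(n_samples))))

# Hypothetical annotation: one category label per transcript.
control_labels <- c("AFFX", "other_spike", "normgene->exon", "normgene->intron")
category <- rep("main", n_transcripts)
category[1:100] <- rep(control_labels, each = 25)
names(category) <- rownames(expr)

# Step 1: drop control transcripts, whatever chip they came from.
keep_main <- !(category[rownames(expr)] %in% control_labels)
expr_main <- expr[keep_main, , drop = FALSE]

# Step 2: drop the less variable half of the remaining transcripts,
# using the per-transcript interquartile range as the variability measure.
iqr <- apply(expr_main, 1, IQR)
cutoff <- quantile(iqr, probs = 0.5)
expr_filtered <- expr_main[iqr > cutoff, , drop = FALSE]

dim(expr_filtered)
```

Nothing in these two steps refers to the array type: only the set of control
labels would change between 3'IVT, Gene and Exon arrays.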
Christian


On 4/28/11 11:10 PM, Javier Pérez Florido wrote:
> Thanks Christian,
> But are the other filter steps (the ones applied to 3'IVT) also
> correct for Gene arrays?
>
> Thanks,
> Javier
>
>
> On 28/04/2011 23:05, cstrato wrote:
>> Dear Javier,
>>
>> In principle every workflow for Exon arrays can also be applied to
>> Gene arrays.
>>
>> One more note:
>> In principle you could use package "xps" for all these steps:
>>
>> - rma(.., exonlevel="core") will only use the core genes but not AFFX
>> or control genes
>>
>> - PreFilter(mad=c(0.5,0.01)) etc will eliminate all transcripts with
>> low variability
>>
>> For more details see e.g. example script "xps/examples/script4exon.R"
>> which shows you the workflow for HuExon and HuGene arrays.
>>
>> Best regards
>> Christian
>> _._._._._._._._._._._._._._._._._._
>> C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
>> V.i.e.n.n.a A.u.s.t.r.i.a
>> e.m.a.i.l: cstrato at aon.at
>> _._._._._._._._._._._._._._._._._._
>>
>>
>> On 4/28/11 7:50 PM, Javier Pérez Florido wrote:
>>> Dear list,
>>> A possible data analysis workflow for EXON arrays could be as follows
>>> (extracted from "Exon Array data analysis using Affymetrix Power Tools
>>> and R statistical software", Briefings in Bioinformatics):
>>>
>>> * Normalization and summarization (at exon or gene level) of the
>>>   array set.
>>> * Quality control of the summarized exon array data (to
>>>   remove possible outliers)
>>> * Specific filtering steps, for example:
>>>     o Restrict analysis to core probesets
>>>     o Filter for undetected probesets (i.e., undetected exons),
>>>       making use of DABG (Detected Above Background) analysis.
>>>     o Filter for cross-hybridizing probesets (exons)
>>>     o Filter for genes undetected in all groups
>>>
>>> I'm running a gene-level data analysis on Human GENE ST 1.0 (not EXON)
>>> arrays, which are, in principle, designed for gene expression profiling,
>>> that is, a gene-level analysis. My question is related to the filtering
>>> step. I was wondering whether, once normalization and summarization have
>>> been run at the transcript level (core), giving 33297 transcripts, the
>>> following filtering can be run before differential expression analysis:
>>>
>>> * Remove control transcripts such as other_spike, AFFX, pos_control
>>>   (normgene->exon) and neg_control (normgene->intron). This step
>>>   removes around 4156 transcripts.
>>> * Remove transcripts with very low variability using the varFilter
>>>   function (genefilter package).
>>>
>>> Since these were the steps recommended in the "Bioconductor Case Studies"
>>> book for 3'IVT arrays (the controls were different in 3'IVT), I was
>>> wondering whether these 2 filtering steps can also be used on Human Gene
>>> arrays for gene-level analysis or whether I instead have to run the
>>> filtering steps described above for EXON arrays.
>>> Thanks,
>>> Javier
>>> P.S. If you know of any data analysis workflow document for HuGene
>>> arrays, please let me know.
>>>
>>
>
>


