[BioC] How to do Affy ST array analysis

Wed May 7 14:57:09 CEST 2014

Hi Ninni,

I guess a very simple workflow would be:

1. Read the CEL files
library(oligo)
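## list.celfiles() lists the CEL files in the working directory;
## any character vector of CEL file paths works here
celFiles = list.celfiles(full.names = TRUE)
rawData = read.celfiles(celFiles)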

2. Perform RMA and get "transcript cluster" summarized data back,
 using only "core" genes (the "safely" annotated genes according to
 Affymetrix). This is the default in oligo.

Eset = rma(rawData,target="core")

3. Load the annotation package and annotate the "transcript clusters" with
the gene symbol, gene name and Ensembl ID contained in that package.

## load Annotation package
library("hugene20sttranscriptcluster.db")

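## Look up one annotation column ("what") for every feature in Eset;
## features without a mapping get the value of "missing".
## Note: for one-to-many mappings only the first match is kept.
annotateGene = function(db, what, missing) {
    tab = toTable(db[intersect(featureNames(Eset), mappedkeys(db))])
    mt = match(featureNames(Eset), tab$probe_id)
    ifelse(is.na(mt), missing, tab[[what]][mt])
}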

fData(Eset)$symbol = annotateGene(hugene20sttranscriptclusterSYMBOL, "symbol", missing = NA)
fData(Eset)$genename = annotateGene(hugene20sttranscriptclusterGENENAME, "gene_name", missing = NA)
fData(Eset)$ensembl = annotateGene(hugene20sttranscriptclusterENSEMBL, "ensembl_id", missing = NA)

4. After that, keep only the "transcript clusters" that have an Ensembl gene ID
(for example).
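
Step 4 could be sketched roughly as follows. This is only a minimal sketch: the helper name keepAnnotated is made up for illustration, and the toy ExpressionSet (made-up feature names and placeholder IDs) merely stands in for the Eset from steps 2 and 3, so that the snippet runs with Biobase alone.

```r
library(Biobase)

## Hypothetical helper (not part of oligo or Biobase): keep only the
## features whose annotation column in fData() is not NA.
keepAnnotated = function(eset, column = "ensembl") {
    eset[!is.na(fData(eset)[[column]]), ]
}

## Toy stand-in for the real Eset; feature names and IDs are placeholders.
m = matrix(rnorm(12), nrow = 4,
           dimnames = list(paste0("tc", 1:4), paste0("s", 1:3)))
toy = ExpressionSet(assayData = m)
fData(toy)$ensembl = c("ENSG_A", NA, "ENSG_B", NA)

filtered = keepAnnotated(toy)
featureNames(filtered)   # "tc1" "tc3"
```

With the real data this would simply be Eset = keepAnnotated(Eset), or the one-liner Eset = Eset[!is.na(fData(Eset)$ensembl), ].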

Hope that helps,

Bernd

On Wed,  7 May 2014 05:06:00 -0700 (PDT)
"Ninni Nahm [guest]" <guest at bioconductor.org> wrote:

> 
> Hi all!
> 
> I am feeling a little bit stupid, but I have been searching for two days now (maybe I search wrong?!) and could not figure it out.
> I want to analyze a Human Gene st array. 
> I know that there is the oligo package, I found this annotation package here pd.hugene.2.0.st, but, I do not know how to do the steps. I am used to the affy package and affy pipelines. 
> All I find when searching for solutions are ways to make your own annotation package. That is not necessary, I think, because I found pd.hugene.2.0.st. Or am I wrong? Somehow I can't use it in the same way as I do with, for example, the hgu133a.db package that provides me the annotations.
> 
> I'm really lost... 
> 
> I want to do:
> 
> - probe level analysis (similar to affyplm)
> - RMA normalization (Somehow oligo does this, I think)
> - Filter probes that are controls (as one does with affy: AFFX, for hgu133a)
> - annotation of probesets (normally, I would use the IQR filter to get unique entrez ids, but how do I do this with the ST array?)
> 
> 
> I know that there is something about probe and transcript to be aware of and core? But I cannot connect the workflow. 
> 
> I would be so happy if someone helped me or pointed me to the right docs. (The oligo user guide is not so helpful for me because I still don't understand what to do with what and when...) Sorry!
> 
> Thanks!
> 
> Ninni