[BioC] Analyzing expression Affymetrix Hugene1.0.st array

James W. MacDonald jmacdon at uw.edu
Fri Sep 28 16:48:10 CEST 2012

Hi Juan,

On 9/28/2012 6:10 AM, Juan Fernández Tajes wrote:
> Dear List,
> I'm working with expression data obtained from the Affymetrix HuGene 1.0 ST array. I'm interested in knowing how many genes on chromosome 16 are expressed. Surprisingly, all 808 genes on the array that map to chromosome 16 have expression values (from 2.01 to 12.4). Can I conclude that all these genes are expressed in this tissue?

Not really. Microarrays are not suitable for determining if a gene is 
being expressed or not. The only use IMO of microarray data is to 
determine if a gene is *differentially* expressed. This is what Benilton 
is getting at in his response to your question.

The expression values we generate from a set of microarrays are very far 
removed from the actual amount of mRNA that existed in the samples we 
are measuring, and have undergone quite a bit of manipulation. In 
addition, there is quite a bit of technical noise introduced in each 
step of the process. So the best we can hope for is that the expression 
value for a given gene is proportional to the amount of mRNA that 
existed in the original sample, but not that we are quantifying the 
amount of mRNA.

In addition, the expression values are derived from a 16-bit TIFF 
image, so the underlying intensities can be at most 2^16 - 1 = 65535 
on the natural scale, or just under 16 on the log2 scale. Given that 
fact, do you really want to contend that a gene with an expression 
value of 2^2.01 (about 4 on the natural scale) is being expressed? 
That expression level is likely not distinguishable from noise. So one 
more difficulty in deciding if a gene is expressed is deciding at which 
point you can distinguish signal from underlying noise.
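
To put numbers on that, here is a rough sketch in base R that 
back-transforms the two extremes you report (2.01 and 12.4) and 
compares them to the 16-bit ceiling. The variable names are mine, and 
this assumes the RMA values can be loosely read as log2 intensities, 
which glosses over background correction, normalization and 
summarization, so treat the result as an order-of-magnitude 
illustration only.

```r
# Largest value a 16-bit image can store, on the natural scale
max_intensity <- 2^16 - 1          # 65535

# The same ceiling on the log2 scale used for RMA expression values
max_log2 <- log2(max_intensity)    # just under 16

# Back-transform the lowest and highest reported chromosome 16 values
low_natural  <- 2^2.01
high_natural <- 2^12.4

# Express each as a fraction of the available intensity range
frac_of_range <- c(low = low_natural, high = high_natural) / max_intensity

print(round(c(low = low_natural, high = high_natural), 2))
print(signif(frac_of_range, 3))
```

The lowest value sits at a tiny fraction of the available range, which 
is why I would not want to call that gene "expressed" on this evidence 
alone.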



> Many thanks in advance
> Here is my code:
> library(oligo)
> library(hugene10sttranscriptcluster.db)
> geneCELs.N<- list.celfiles(getwd(), full.names=TRUE)
> affyGeneFS.N<- read.celfiles(geneCELs.N)
> myAB.N<- affyGeneFS.N
> sampleNames(myAB.N)<- sub("\\.CEL$", "", sampleNames(myAB.N))
> metadata_array.N<- read.delim(file="metadata.txt", header=T, sep="\t")
> rownames(metadata_array.N)<- metadata_array.N$Sample_ID
> phenoData(myAB.N)<- new("AnnotatedDataFrame", data=metadata_array.N)
> myAB.N_rma<- rma(myAB.N, target="core")
> annotation(myAB.N_rma)<- "hugene10sttranscriptcluster.db"
> ppc<- function(x) paste("^", x, sep="")
> myFindMap<- function(mapEnv, which){
>   myg<- ppc(which)
>   a1 = eapply(mapEnv, function(x)
>     grep(myg, x, value=TRUE))
>   unlist(a1)
> }
> chr16.N<- myFindMap(hugene10sttranscriptclusterCHR, 16)
> chr16.N<- as.data.frame(chr16.N)
> chr16.N$probes<- rownames(chr16.N)
> probes.chr16.N<- chr16.N$probes
> sel.N<- match(probes.chr16.N, featureNames(myAB.N_rma), nomatch=0)
> es2_chr16.N<- myAB.N_rma[sel.N,]
> data.exprs.N<- as.data.frame(exprs(es2_chr16.N))
> g.N<- featureNames(es2_chr16.N)
> linked.N<- links(hugene10sttranscriptclusterSYMBOL)
> data.exprs.N.symbol<- merge(data.exprs.N, linked.N, by.x="row.names", by.y="probe_id")
> row.names(data.exprs.N.symbol)<- data.exprs.N.symbol[[1]]
> data.exprs.N.symbol<- data.exprs.N.symbol[, -1]
> data.exprs.N.symbol$Mean.Exprs<- rowMeans(data.exprs.N.symbol[, 1:12])
> Juan
> ---------------------------------------------------------------
> Juan Fernandez Tajes, ph. D
> Departamento de Biología Celular y Molecular
> Facultad de Ciencias-Universidade da Coruña
> Tlf. +34 981 167000 ext 2030
> e-mail: jfernandezt at udc.es
> ----------------------------------------------------------------

James W. MacDonald, M.S.
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
