[BioC] Analyzing expression Affymetrix Hugene1.0.st array

Fri Sep 28 12:34:23 CEST 2012

Juan,

To call a gene (differentially) 'expressed', you need to compare its
expression to some baseline. The most basic workflow for such a task
starts by defining a control group and the group that you want to
analyse (which I'll call here the "group of interest"). After
preprocessing all the samples, you prepare a design matrix and fit
linear models to test the hypothesis of differential expression (i.e.,
you compare the expression of the group of interest to the expression
of the control group). This gives you (variants of) t-tests, which,
combined with a threshold, give you a set of candidates for
differential expression.
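
As a rough sketch of that workflow, assuming the limma package and a
made-up split of 12 arrays into 6 controls and 6 arrays of interest
(the simulated matrix, the group labels and the 0.05 cutoff are all
illustrative assumptions, not part of your data):

```r
library(limma)

# Hypothetical stand-in for the RMA output: 100 probesets x 12 arrays,
# log2 scale. With real data, lmFit() also accepts an ExpressionSet.
set.seed(1)
expr <- matrix(rnorm(100 * 12, mean = 7, sd = 0.3), nrow = 100,
               dimnames = list(paste0("ps", 1:100), paste0("s", 1:12)))

# Assumed grouping: first 6 arrays are controls, last 6 the group of interest
group <- factor(rep(c("control", "interest"), each = 6),
                levels = c("control", "interest"))

# Shift 5 probesets in the group of interest to simulate true differences
expr[1:5, group == "interest"] <- expr[1:5, group == "interest"] + 2

# Design matrix: intercept = control mean, second column = interest - control
design <- model.matrix(~ group)

# Fit one linear model per probeset and compute moderated t-statistics
fit <- eBayes(lmFit(expr, design))

# Rank all probesets for the group effect, with BH-adjusted p-values
res <- topTable(fit, coef = "groupinterest", number = Inf,
                adjust.method = "BH")

# Threshold on the adjusted p-value to get the candidate set
candidates <- rownames(res)[res$adj.P.Val < 0.05]
```

The threshold is a choice you have to make for your own experiment; a
fold-change cutoff on res$logFC is often combined with it.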

That said, you need to define what the control group is for your
experiment and then proceed with the statistical procedures for
hypothesis testing.

benilton

On 28 September 2012 11:10, Juan Fernández Tajes <jfernandezt at udc.es> wrote:
> Dear List,
>
> I'm working with expression data obtained from the Affymetrix HuGene 1.0 ST array. I'm interested in knowing how many genes on chromosome 16 are expressed. Surprisingly, all the genes (808) included on the array and mapped to chromosome 16 have expression values (from 2.01 to 12.4). Can I conclude that all these genes are expressed in this tissue?
>
> Many thanks in advance
>
> Here is my code:
>
>
> library(oligo)
> library(hugene10sttranscriptcluster.db)
> geneCELs.N <- list.celfiles(getwd(), full.names=TRUE)
> affyGeneFS.N <- read.celfiles(geneCELs.N)
> myAB.N <- affyGeneFS.N
> sampleNames(myAB.N) <- sub("\\.CEL$", "", sampleNames(myAB.N))
> metadata_array.N <- read.delim(file="metadata.txt", header=TRUE, sep="\t")
> rownames(metadata_array.N) <- metadata_array.N$Sample_ID
> phenoData(myAB.N) <- new("AnnotatedDataFrame", data=metadata_array.N)
> myAB.N_rma <- rma(myAB.N, target="core")
> annotation(myAB.N_rma) <- "hugene10sttranscriptcluster.db"
>
> ppc <- function(x) paste("^", x, sep="")
> myFindMap <- function(mapEnv, which){
>   myg <- ppc(which)
>   a1 <- eapply(mapEnv, function(x)
>     grep(myg, x, value=TRUE))
>   unlist(a1)
> }
> chr16.N <- myFindMap(hugene10sttranscriptclusterCHR, 16)
> chr16.N <- as.data.frame(chr16.N)
> chr16.N$probes <- rownames(chr16.N)
> probes.chr16.N <- chr16.N$probes
> sel.N <- match(probes.chr16.N, featureNames(myAB.N_rma), nomatch=0)
> es2_chr16.N <- myAB.N_rma[sel.N,]
> data.exprs.N <- as.data.frame(exprs(es2_chr16.N))
> g.N <- featureNames(es2_chr16.N)
> linked.N <- links(hugene10sttranscriptclusterSYMBOL)
> data.exprs.N.symbol <- merge(data.exprs.N, linked.N, by.x="row.names", by.y="probe_id")
> row.names(data.exprs.N.symbol) <- data.exprs.N.symbol[[1]]
> data.exprs.N.symbol <- data.exprs.N.symbol[, -1]
> data.exprs.N.symbol$Mean.Exprs <- rowMeans(data.exprs.N.symbol[, 1:12])
>
>
> Juan
>
>
> ---------------------------------------------------------------
> Juan Fernandez Tajes, ph. D
> Grupo XENOMAR
> Departamento de Biología Celular y Molecular
> Facultad de Ciencias-Universidade da Coruña
> Tlf. +34 981 167000 ext 2030
> e-mail: jfernandezt at udc.es
> ----------------------------------------------------------------