[BioC] Help Regarding Analysis.

James MacDonald jmacdon at med.umich.edu
Wed Jun 2 17:38:23 CEST 2004

Hi Ashay,

1.) You might try using

dat <- read.affybatch(filenames=list.celfiles())

instead of typing out all those filenames. You could then follow up
with either

eset <- rma(dat)

or

eset <- gcrma(dat) # needs library(gcrma)

and compare with your existing results.
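
One rough way to do that comparison is a per-chip correlation between the two sets of expression values. The sketch below uses simulated matrices as stand-ins, and the names `existing` and `candidate` are made up for illustration. With real data the matrices would come from exprs() on each exprSet, and since rma() reports log2 values, the other set may need log2() first.

```r
# Simulated stand-ins for two expression matrices on the same chips.
# With real data these might be exprs(affy.es) and exprs(eset).
set.seed(42)
probes <- paste0("probe", 1:100)
chips <- paste0("chip", 1:5)
existing <- matrix(rnorm(500, mean = 8), nrow = 100,
                   dimnames = list(probes, chips))
# Second set: the first set plus a little noise (illustration only)
candidate <- existing + matrix(rnorm(500, sd = 0.2), nrow = 100)

# Compare only probesets present in both results
common <- intersect(rownames(existing), rownames(candidate))

# Pearson correlation between the two methods, one value per chip
per_chip_cor <- sapply(chips, function(ch) {
  cor(existing[common, ch], candidate[common, ch])
})
print(round(per_chip_cor, 3))
```

Chips with a noticeably lower correlation than the rest are the ones worth looking at more closely.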

2.) To do the t-test you can use the esApply functionality in Biobase.
There is a vignette for Biobase that should show you what you need to
do. Alternatively, you can use the mt.teststat() function in multtest,
which will be orders of magnitude faster. To use a cutoff of 0.5, you
can do something like this:

index <- abs(vectoroftstats) > 0.5
sig.eset <- affy.es[index,] # an exprSet containing only 'significant' genes
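
To make the steps concrete, here is a minimal sketch in base R that computes a per-gene t-statistic and applies the cutoff. The matrix `expr` and the group labels in `classlabel` are simulated and purely hypothetical. With real data `expr` would be exprs(affy.es) and the labels would describe your own chips.

```r
# Simulated stand-in for exprs(affy.es): 100 genes x 6 chips
set.seed(1)
expr <- matrix(rnorm(600), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("chip", 1:6)))

# Hypothetical design: first three chips are group 0, last three are group 1
classlabel <- c(0, 0, 0, 1, 1, 1)

# Welch two-sample t-statistic for each row (gene).
# With multtest installed, mt.teststat(expr, classlabel) should be a
# much faster way to get the same kind of statistic.
tstats <- apply(expr, 1, function(x) {
  unname(t.test(x[classlabel == 1], x[classlabel == 0])$statistic)
})

# Logical index of genes whose absolute t-statistic exceeds the cutoff
index <- abs(tstats) > 0.5

# Keep only the selected rows, as affy.es[index,] would for an exprSet
sig <- expr[index, , drop = FALSE]
print(dim(sig))
```

The same logical index works for subsetting the exprSet itself, as in the lines above.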



James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109

>>> "Ashay Manjure" <manjure at usc.edu> 06/01/04 09:32PM >>>
Hello All,

Our lab is new to Microarray analysis and we are in the learning stage
of the entire process. We have some CEL files from 4 different labs.
One lab works on RAT data and we have CEL files from 19 chips of RAE230A.
Another lab works on MICE data using MU11K. Another works on MICE data
using a chip we don't know as yet, and one more lab works on HUMAN
data using HU6800. The main aim is to compare some common genes between
these species for a specific purpose.

The following is the stage we have reached in analyzing the CEL files
for each lab. (This is picked from one of the readme files, but it
gives a very good understanding of where we have reached.)

What to do when you get the CEL Files?

The processing of probe-level data into gene expression measures is
a 4-step process:
1. Background Correction
2. Normalization
3. PM Correction and
4. Summary Expression value computation.

This is done in the software package called R, with the affy package
installed from bioconductor.org.

After launching R, the following commands were typed in it.

# Loads the Affy Library Files

# read the data using affybatch, Chip-01.CEL etc are the CEL files for
# the Finch Experiment.
# We had 19 chips in all in this experiment. Chip used RAE230A,B
> data <-

# Run the Background correction, normalization etc.
# (These were selected with some recommendations from Diana Abudeva and
# some from articles/papers on the internet.)
# 1. Bg Correct Using "RMA2" Method by Irizarry et al.
# 2. Normalization Using "Constant"
# 3. Pm Correction with "PMonly" (PM - Intensity values read for
# 4. Summary Using "LiWong" i.e DChip.
# The Output of all this stored in the affy.es object.
>affy.es <- expresso(data, bgcorrect.method="rma2",
normalize.method="constant", pmcorrect.method="pmonly",

# Write The expression values in a File
> write.exprs(affy.es, file="liwong_normalize-05192004.csv")

We then take the CSV file and open it in MS Excel installed with SAM
(Significance Analysis of Microarrays).
Through the output of SAM we have some significant genes. But comparing
the results with our theoretical analysis, we were not happy with them.
Someone then suggested that we use t-tests in R itself.

So these are my questions regarding the analysis:

1. Are the Background Correction, Normalization, Summary and PM
Correction methods good enough? If not, what sort of combination would
you suggest, considering that data from different species will be
compared ultimately?

2. How exactly would one go about the T Test in R (affy) with 0.5 as
cutoff value? I am not sure of what commands to apply to the affy.es
object for the t test.

Thanking you. 


Ashay Manjure
University of Southern California.

Bioconductor mailing list
Bioconductor at stat.math.ethz.ch 
