[BioC] Help Regarding Analysis.

Wed Jun 2 03:32:40 CEST 2004

Hello All,

Our lab is new to microarray analysis and we are still learning the
entire process. We have some CEL files from 4 different labs. One lab
works on rat data, and we have CEL files from 19 RAE230A chips. Another
lab works on mouse data using MU11K. A third works on mouse data using a
chip we have not yet identified, and a fourth works on human data using
HU6800. The main aim is to compare some common genes between these
species for a specific purpose.

The following is the stage we have reached in analyzing the CEL files
for each lab. (This is taken from one of our README files, but I think
it gives a good picture of where we are.)

What to do when you get the CEL Files?
==============================================

The processing of probe level data into Gene Expression measures is done
by a 4 step process of
1. Background Correction
2. Normalization
3. PM Correction and
4. Summary Expression value computation.

This is done in R, with the affy package installed from
bioconductor.org.

After launching R, the following commands were typed:
-----------------------------------------------------

# Load the affy package
> library(affy)

# Read the data into an AffyBatch object. Chip-01.CEL etc. are the CEL files
# for the Finch experiment.
# We had 19 chips in all in this experiment. Chip used: RAE230A,B
> data <- read.affybatch(
    "Chip-01.CEL", "Chip-02.CEL", "Chip-03.CEL", "Chip-04.CEL",
    "Chip-05.CEL", "Chip-06.CEL", "Chip-07.CEL", "Chip-08.CEL",
    "Chip-09.CEL", "Chip-10.CEL", "Chip-11.CEL", "Chip-12.CEL",
    "Chip-13.CEL", "Chip-14.CEL", "Chip-15.CEL", "Chip-16.CEL",
    "Chip-17.CEL", "Chip-18.CEL", "Chip-19.CEL")
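
The 19 file names probably do not need to be typed by hand. A minimal sketch, assuming the files really follow the Chip-NN.CEL pattern used above and sit in the working directory (the variable name cel_files is only illustrative):

```r
# Build "Chip-01.CEL" ... "Chip-19.CEL" with zero-padded numbers.
cel_files <- sprintf("Chip-%02d.CEL", 1:19)

# Quick look at the first and last names before reading anything.
print(head(cel_files, 3))
print(tail(cel_files, 1))

# The vector can then be handed to read.affybatch() through its
# 'filenames' argument (needs the affy package and the CEL files on disk):
# data <- read.affybatch(filenames = cel_files)
```

The read.affybatch() line is left commented out so that the snippet runs even without affy or the CEL files present.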

# Run the background correction, normalization etc.
# (These were selected with some recommendations from Diana Abudeva and
# some from articles/papers on the internet.)
# 1. Background correction using the "rma2" method by Irizarry et al.
# 2. Normalization using the "constant" method
# 3. PM correction with "pmonly" (PM = intensity values read for perfect
#    matches)
# 4. Summary using "liwong", i.e. dChip
# The output of all this is stored in the affy.es object.
> affy.es <- expresso(data, bgcorrect.method="rma2",
    normalize.method="constant", pmcorrect.method="pmonly",
    summary.method="liwong")
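
To make the normalization choice concrete: as far as I understand it, the "constant" method rescales each chip by a single factor so that its average intensity matches that of a reference chip. The sketch below shows that idea in base R on made-up numbers; it is an illustration of the scaling only, not the affy code itself, and the matrix x and the choice of chip 1 as reference are assumptions for the example.

```r
# Toy intensities: 4 probes (rows) on 3 chips (columns). Values are made up.
x <- matrix(c(100, 200, 300, 400,    # chip1
              150, 300, 450, 600,    # chip2
               50, 100, 150, 200),   # chip3
            nrow = 4,
            dimnames = list(paste0("probe", 1:4), paste0("chip", 1:3)))

# Use the first chip as the reference (an assumption for this example).
ref <- 1

# One scaling factor per chip: reference mean divided by that chip's mean.
scale_factors <- mean(x[, ref]) / colMeans(x)

# Multiply every column by its own factor.
x_scaled <- sweep(x, 2, scale_factors, `*`)

# After scaling, all chips share the reference chip's mean intensity.
print(colMeans(x_scaled))
```

Because each chip gets only one multiplicative factor, this kind of scaling lines up overall brightness but does not change the shape of a chip's intensity distribution.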

# Write the expression values to a file
> write.exprs(affy.es, file="liwong_normalize-05192004.csv")

We then take the CSV file and open it in MS Excel with SAM
(Significance Analysis of Microarrays) installed.
From the SAM output we get some significant genes, but when we compared
the results with what we expected theoretically, we were not happy with
them. Someone then suggested that we use t-tests in R itself.

So these are my questions regarding the analysis:

1. Are the background correction, normalization, PM correction and
summary methods we chose good enough? If not, what combination would
you suggest, considering that data from different species will
ultimately be compared?

2. How exactly would one go about the t-test in R (affy) with 0.5 as the
cutoff value? I am not sure which commands to apply to the affy.es
object for the t-test.

Thanking you. 

Regards,

Ashay Manjure
University of Southern California.