[BioC] normalization for data in a matrix

Mon May 12 15:33:08 CEST 2008

On Mon, May 12, 2008 at 8:54 AM, Federico Abascal <fabascal at cnb.csic.es> wrote:
> Hello,
>
>  I have been reading about normalization methods for microarray
>  intensities and found several Bioconductor packages (affy, limma,
>  aroma) implementing different approaches. I am still lost with so much
>  information. To test the different methods I would like to use my own
>  matrix of intensities (I have no information other than the
>  intensities). My question is: what is the best (or easiest) way to test
>  the different normalization methods with a matrix like the one I have?
>
>  In addition, I would like to know if normalizing by dividing the
>  intensity of each gene by the sum of intensities in the corresponding
>  array ("Total Intensity Normalization") is a bad approach?

There are much better approaches than total intensity normalization.
See the affy and vsn packages as potential candidates.  RMA (available
via the affy package) is pretty standard.
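
To make the difference concrete, here is a minimal sketch in plain
Python (not the Bioconductor code) that works directly on a matrix of
intensities.  It contrasts total intensity normalization with quantile
normalization, which is the between-array normalization step that RMA
uses.  The function names and the toy matrix are made up for
illustration, and ties are broken by row order, which is a
simplification.

```python
def total_intensity_normalize(matrix):
    """Divide each value by the sum of its column (one column per array)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_sums = [sum(matrix[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [[matrix[r][c] / col_sums[c] for c in range(n_cols)]
            for r in range(n_rows)]


def quantile_normalize(matrix):
    """Force every column (array) to share the same intensity distribution.

    Ties are broken by row order (a simplification for this sketch).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    # For each column, the row indices ordered by increasing intensity.
    orders = [sorted(range(n_rows), key=lambda r, c=c: matrix[r][c])
              for c in range(n_cols)]
    # Reference distribution: mean across arrays of the k-th smallest value.
    reference = [
        sum(matrix[orders[c][k]][c] for c in range(n_cols)) / n_cols
        for k in range(n_rows)
    ]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        for k, r in enumerate(orders[c]):
            # The k-th smallest value of each array becomes reference[k].
            result[r][c] = reference[k]
    return result


# Hypothetical toy data: 4 genes (rows) x 3 arrays (columns).
intensities = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
]

scaled = total_intensity_normalize(intensities)
quantiled = quantile_normalize(intensities)
```

Total intensity scaling only makes the column sums equal, so arrays can
still differ in the shape of their distributions.  Quantile
normalization makes the whole distribution identical across arrays.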

>  I found that genes around particular rows (e.g. row 2000, row 10000, etc.)
>  tend to have greater intensities across all of the samples (the array
>  is Affymetrix GeneChip Human Genome U133 Plus 2.0 Array). An example of
>  this for a particular hybridization can be found at:
>  http://biocomp.cnb.uam.es/~fabascal/tmp.png
>  Most of these peaks, at least, are related to ribosomal proteins, which
>  explains the increased expression. What I do not understand is why those
>  increased measures tend to appear together, in blocks. Could this be an
>  artifact? Does it require a correction? How could it be corrected? These
>  might be too many questions, sorry!

You will want to do quality control of your arrays as part of the
preprocessing.  There are numerous packages for affy QC; simpleaffy is
perhaps a good place to start.  As for the intensities of any
individual gene, their scale is "arbitrary".  In other words, the
intensity of one gene cannot be compared directly to that of another;
only the same gene can be compared across arrays.
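
One way to work with this, sketched below in plain Python, is to look
at each gene relative to itself: take log2 of the (already normalized)
intensities and subtract the gene's own mean across arrays.  The
function name and the numbers are hypothetical, and the sketch assumes
all intensities are strictly positive.

```python
import math


def center_genes_log2(matrix):
    """log2-transform, then subtract each gene's (row's) mean across arrays."""
    centered = []
    for row in matrix:
        logs = [math.log2(v) for v in row]   # requires v > 0
        mean = sum(logs) / len(logs)
        centered.append([x - mean for x in logs])
    return centered


# Hypothetical normalized intensities: 2 genes (rows) x 3 arrays (columns).
normalized = [
    [8000.0, 16000.0, 8000.0],   # bright gene
    [100.0, 200.0, 100.0],       # dim gene with the same relative pattern
]

relative = center_genes_log2(normalized)
```

After centering, both genes show the same profile across the arrays
even though their raw intensities differ by almost two orders of
magnitude, so a high raw value on its own says little about which gene
is "more expressed".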

Hope that helps.

Sean