[BioC] package/function for median center & unit variance

Hooiveld, Guido Guido.Hooiveld at wur.nl
Tue Jul 12 11:23:34 CEST 2011


Hi Steve and James,

Thanks for pointing me in the right direction.

For the archive (in case someone has the same question):

First normalize:
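library(affy)
affy.data <- ReadAffy()   # reads all CEL files in the working directory
x.norm <- rma(affy.data)
y <- exprs(x.norm)        # matrix: rows = probesets, columns = arrays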

To mean center and scale the (normalized) dataset:
yscaled <- t(scale(t(y)))

This returns a dataset (yscaled) in which each row (probeset) is *mean* centered and has a standard deviation of one (unit variance normalized; here *mean*=0, SD=1).
Source: excellent site of Dr Girke:
http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#clustering_prepro
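
A quick way to check the mean centering and scaling without any CEL files is to run it on a small simulated matrix. This is only a sketch: the toy matrix and its dimnames are made up for illustration and stand in for the exprs() output.

```r
# Toy stand-in for the RMA expression matrix (rows = probesets, columns = arrays).
# The values and names are invented for illustration only.
set.seed(1)
y <- matrix(rnorm(20, mean = 8, sd = 2), nrow = 4,
            dimnames = list(paste0("probe", 1:4), paste0("array", 1:5)))

# scale() works column-wise, so transpose, scale, and transpose back
yscaled <- t(scale(t(y)))

rowMeans(yscaled)       # should be 0 (up to rounding error) for every row
apply(yscaled, 1, sd)   # should be 1 for every row
```

The double transpose is needed because scale() centers and scales columns, while here each probeset is a row.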


For *median* centering + scaling:

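#1st: perform median centering (per row, i.e. per probeset):
ymed <- t(apply(y, 1, function(x) x - median(x)))
#Note: 1=row, 2=column. apply() returns the per-row results as columns,
#hence the t() to get back to probesets in rows and arrays in columns.

#2nd: unit variance normalization (here *median*=0, SD=1)
#Note: scale(..., center=FALSE) divides by the root mean square, not the SD,
#so divide each row by its SD explicitly:
yscaled <- sweep(ymed, 1, apply(ymed, 1, sd), "/")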
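
For anyone who wants to verify the median centering + scaling on their own data, here is a minimal sketch on a simulated matrix (the toy data are an assumption, standing in for the RMA output). It uses sweep() for both steps, which keeps the probesets in rows throughout, and then checks that every row ends up with median 0 and SD 1.

```r
# Toy data, invented for illustration: 4 probesets (rows) x 5 arrays (columns)
set.seed(2)
y <- matrix(rnorm(20, mean = 8, sd = 2), nrow = 4)

# Subtract each row's median; sweep() keeps the original orientation
ymed <- sweep(y, 1, apply(y, 1, median), "-")

# Divide each row by its SD (subtracting a constant does not change the SD)
yscaled <- sweep(ymed, 1, apply(ymed, 1, sd), "/")

apply(yscaled, 1, median)  # row medians, should all be 0
apply(yscaled, 1, sd)      # row SDs, should all be 1
dim(yscaled)               # same shape as y
```

Note that after median centering the row means are generally not 0, which is why scale(..., center=FALSE) (which divides by the root mean square) would not give SD=1 here.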

Regards,
Guido



--------------------------------------------------------- 
Guido Hooiveld, PhD 
Nutrition, Metabolism & Genomics Group 
Division of Human Nutrition 
Wageningen University 
Biotechnion, Bomenweg 2 
NL-6703 HD Wageningen 
the Netherlands 
tel: (+)31 317 485788 
fax: (+)31 317 483342 
email:      guido.hooiveld at wur.nl
internet:   http://nutrigene.4t.com 
http://www.researcherid.com/rid/F-4912-2010


-----Original Message-----
From: Steve Lianoglou [mailto:mailinglist.honeypot at gmail.com] 
Sent: Monday, July 11, 2011 15:25
To: Hooiveld, Guido
Cc: bioconductor (bioconductor at stat.math.ethz.ch)
Subject: Re: [BioC] package/function for median center & unit variance

Hi Guido,

On Mon, Jul 11, 2011 at 9:17 AM, Hooiveld, Guido <Guido.Hooiveld at wur.nl> wrote:
> Dear list,
>
> After (RMA) normalization I would like to post-process my array data for downstream analyses by means of median centering and/or unit variance normalization.
> Does anyone know a library that contains these functions?
> Using available functions would minimize the chance of errors due to my limited R coding skills... ;) I had a look at the genefilter package but it doesn't contain these functions.

The base `scale` function will get you close to where you want to be.
It works on a matrix, so you'll have to get your `exprs` matrix out of your RMA normalized data.

By default, `scale` actually does mean-centering and then divides the columns by their std.dev. The help page for `scale` will also point you to `sweep` which has an example of how to median center the columns of a matrix.
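
As a rough sketch of the `sweep` approach mentioned above (the toy matrix `m` and the names `col.med` and `m.centered` are made up for illustration), median centering the columns of a matrix could look like this:

```r
# Toy matrix, invented for illustration: 4 rows x 3 columns
set.seed(3)
m <- matrix(rnorm(12), nrow = 4)

col.med <- apply(m, 2, median)       # one median per column
m.centered <- sweep(m, 2, col.med)   # sweep() subtracts by default (FUN = "-")

apply(m.centered, 2, median)         # column medians, should be 0 up to rounding
```

To center rows instead of columns, use margin 1 in both the `apply` and the `sweep` call.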

HTH,

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


