[BioC] How do I set the median when performing MAS5.0 normalization

James MacDonald jmacdon at med.umich.edu
Tue Dec 7 00:54:14 CET 2010


Hi Matthew,

No, the sc parameter scales the data using a trimmed mean rather than the median. You can see that by inspecting first the code for mas5():

 mas5 <- function (object, normalize = TRUE, sc = 500, analysis = "absolute", 
    ...) 
{
    res <- expresso(object, bgcorrect.method = "mas", pmcorrect.method = "mas", 
        normalize = FALSE, summary.method = "mas", ...)
    if (normalize) 
        res <- affy.scalevalue.exprSet(res, sc = sc, analysis = analysis)
    return(res)
}

and then by looking at the code for affy.scalevalue.exprSet():

affy.scalevalue.exprSet <- function (eset, sc = 500, analysis = "absolute") 
{
    analysis <- match(analysis, c("absolute", "comparison"))
    if (analysis == 1) 
        nf <- 1
    else stop("sorry! comparison not implemented.")
    for (i in 1:ncol(exprs(eset))) {
        slg <- exprs(eset)[, i]
        sf <- sc/mean(slg, trim = 0.02)
        reported.value <- nf * sf * slg
        exprs(eset)[, i] <- reported.value
    }
    return(eset)
}

So this just centers things on a 2% trimmed mean. You _could_ just hijack this function and set the trim to 0.5 to do the median scaling, but it is probably easier to 'roll yer own'.

Best,

Jim



James W. MacDonald, M.S.
Biostatistician
Douglas Lab
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
>>> Matthew Willmann  12/06/10 5:53 PM >>>
Hi Jim,

Thank you.  Does this mean that I can obtain the desired normalization by the following?

mas5(data.MRW, normalize=FALSE, sc=50)


Matthew
-----------------------------------------------------
Matthew R. Willmann, Ph.D.
Research Associate, Poethig Lab
University of Pennsylvania
Department of Biology
433 S. University Avenue
Philadelphia, PA 19104
Lab phone: 215-898-8916
Cell: 508-243-2495
Fax: 215-898-8780

On Dec 6, 2010, at 3:55 PM, James W. MacDonald wrote:

> Hi Matthew,
> 
> On 12/6/2010 11:16 AM, Matthew Willmann wrote:
>> Hello,
>> 
>> I am trying to replicate results in a published paper, where they performed a MAS5.0 normalization with the median normalized to 50 and then defined genes as present if the normalized expression value is higher than 30.  Does anyone know how I can alter the standard code to establish these parameters?
>> 
>>> Cel.files=list.files(pattern=".CEL")
>>> data.MRW=ReadAffy(filenames=Cel.files)
>>> mas5(data.MRW, normalize=TRUE, sc=500)
> 
> From the help page for mas5():
> 
> Usage:
> 
>     mas5(object, normalize = TRUE, sc = 500, analysis = "absolute", ...)
> 
> Arguments:
> 
>  object: an instance of 'AffyBatch'
> 
> normalize: logical. If 'TRUE' scale normalization is used after we
>          obtain an instance of 'ExpressionSet'
> 
> So if you use mas5() with normalize = FALSE, then you don't get any scale normalization. You can then simply scale each chip to have a median of 50.
> 
> Best,
> 
> Jim
> 
> 
>> 
>> Thank you for your time and advice.
>> 
>> 
>> Matthew
>> 
>> 
>> 
>> -----------------------------------------------------
>> Matthew R. Willmann, Ph.D.
>> Research Associate, Poethig Lab
>> University of Pennsylvania
>> Department of Biology
>> 433 S. University Avenue
>> Philadelphia, PA 19104
>> Lab phone: 215-898-8916
>> Cell: 508-243-2495
>> Fax: 215-898-8780
>> 
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
> **********************************************************
> Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 
> 


**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 



More information about the Bioconductor mailing list