[BioC] Potential bug when finding baseline array in normalize.AffyBatch.invariantset

Thu Aug 4 21:29:57 CEST 2011

Hi Michael,

Thanks for the clear bug report. This is now fixed in the devel version.
The fix will propagate to the download repositories over the next few
days, but you will need the devel version of R to get it via biocLite().

Best,

Jim

On 8/3/2011 11:47 AM, michael mayhew wrote:
>   Hello all,
>
> I've looked at the code in the most recent version of the affy package, and
> I've corresponded with Dr. Laurent Gauthier, the listed author of the
> function in the R docs.
>
> I've been using the normalize.AffyBatch.invariantset function (via a call to
> expresso) and noticed something a little peculiar with how the baseline
> array is found.
>
>     Here is a snippet of the code where baseline.type="mean" (the code is the
> same for baseline.type="median"):
>
>        if (baseline.type == "mean") {
>            m <- vector("numeric", length = nc)
>            for (i in 1:nc) m[i] <- mean(intensity(abatch)[pms, i])
>            refindex <- trunc(median(rank(m)))
>            rm(m)
>            baseline.chip <- c(intensity(abatch)[pms, refindex])
>            if (verbose)
>                cat("Data from", sampleNames(abatch)[refindex],
>                    "used as baseline.\n")
>        }
>
>     In red, I've highlighted the parts I'm interested in. What this code
> seems to do is always return the array in the middle of the order in which
> the array data were given to make the AffyBatch object.
>
>     However, it seems like you would want to find the index of the array that
> corresponds to the middle intensity (which is not refindex). I think it
> should be something like:
>
>        refindex <- match(trunc(median(rank(m))), rank(m))
>
>     My colleagues and I noticed this when we entered our microarray data
> (using the ReadAffy(widget=TRUE) command) in two different orders. We then
> ended up with two different baseline arrays which didn't make much sense. It
> seemed to us that we should get the same baseline array regardless of the
> order in which the data were entered. The baseline array seemed like it
> should be chosen based on the expression values on the array.
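
The order dependence described above can be reproduced without any array data. The following is a minimal sketch, not the affy source: the vector m holds made-up per-array mean intensities, and the array names and the helpers old_index() and new_index() are hypothetical names introduced only for this illustration.

```r
# Hypothetical per-array mean PM intensities (made-up numbers, not real data).
m <- c(arrayA = 120, arrayB = 95, arrayC = 140, arrayD = 101, arrayE = 88)

# Computation quoted from the original code: without ties, median(rank(m))
# depends only on length(m), so this is always the middle input position.
old_index <- function(m) trunc(median(rank(m)))

# Computation proposed in the report: find the position of the array
# whose rank equals that middle rank.
new_index <- function(m) match(trunc(median(rank(m))), rank(m))

# Same arrays, entered in a different order.
shuffled <- m[c(3, 1, 5, 2, 4)]

names(m)[old_index(m)]                # "arrayC"
names(shuffled)[old_index(shuffled)]  # "arrayE"
names(m)[new_index(m)]                # "arrayD"
names(shuffled)[new_index(shuffled)]  # "arrayD"
```

With these numbers the original computation picks a different baseline for each input order, while the match() version picks arrayD (the middle value, 101) both times. One caveat, assuming the default ties.method of rank(): tied means produce fractional ranks, so match() may return NA in that case.
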
>
>      Thanks again for your time, and I look forward to hearing your response.
>
> Michael Mayhew
> Graduate Student
> Program in Computational Biology & Bioinformatics
> Duke University

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826