[BioC] can't get summary.stat after "invariantset" normalizat ion

Laurent Gautier laurent@cbs.dtu.dk
Sun, 20 Oct 2002 19:35:42 +0200


Without having your particular dataset, one can only conjecture.


L.



On Fri, Oct 18, 2002 at 04:21:48PM -0400, Man, Michael wrote:
> Thanks for the help.  The problem is still there.  Here are the results ...
> -m
> 
> > tmp
> Plob object
> CDF used=HG_U95A.CDF
> number of chips=8
> number of genes=12626
> number of probe pairs=201807
> annotation=
> description=
> notes=normalized by invariant set
> size of chip=640x640
> chip names=
> 1532m99hpp_av04
> 1532n99hpp_av04
> 1532o99hpp_av04
> 1532p99hpp_av04
> 1532q99hpp_av04
> 1532r99hpp_av04
> 1532s99hpp_av04
> 1532t99hpp_av04r
> 
> 
> > ##### patch suggested by Laurent Gautier 10/17/2002 #####
> > bg.parameters  <- function (pm, mm, n.pts = 2^14)
> + {
> +     max.density <- function(x, n.pts) {
>   +         aux <- density(x, kernel = "epanechnikov", n = n.pts,
> na.rm=TRUE)
> +         aux$x[order(-aux$y)[1]]
>   +     }
>   +     mmbg <- max.density(mm, n.pts)
>   +     pmbg <- max.density(pm, n.pts)
>   +     bg.data <- mm[mm < mmbg]
>   +     bg.data <- bg.data - mmbg
>   +     bgsd <- sqrt(sum(bg.data^2)/(length(bg.data) - 1)) * sqrt(2)/0.85
>   +     sig.data <- pm[pm > pmbg]
>     +     sig.data <- sig.data - pmbg
>    +     alpha <- 1/mean(sig.data)
>    +     mubg <- mmbg
>   +     list(alpha = alpha, mu = mubg, sigma = bgsd)
> }
> #+ }
> ## end > ##### end of the patch #####
> 
> 
> > system.time(
> +   latin.LiWong <- express(tmp, normalize=FALSE, summary.stat=li.wong) #
> didn't work
> + )
> Background correcting
> Preparing Data
> Computing expression. This may take a while.
> Error in while (change > delta & iter < maxit) { : 
>         missing value where logical needed
> Timing stopped at: 31.31 0.62 31.94 0 0 
> 
> 
> > system.time( latin.avdf <-
> express(tmp,normalize=FALSE,bg=subtractmm,summary.stat=avdiff) )
> Background correcting
> Preparing Data
> Computing expression. This may take a while.
> Error in var(x, na.rm = na.rm) : missing observations in cov/cor
> Timing stopped at: 13.78 0.55 14.33 0 0 
> 
> 
> > latin.avdf <- express(tmp,normalize=F,bg=subtractmm,
> summary.stat=function(x) apply(x,2,mean,trim=.2))
> Background correcting
> Preparing Data
> Computing expression. This may take a while.
> Error in psort(x, partial) : index 13 outside bounds
> > traceback()
> 9: sort(x, partial = unique(c(lo, hi)))
> 8: mean.default(newX[, i], ...)
> 7: FUN(newX[, i], ...)
> 6: apply(x, 2, mean, trim = 0.2)
> 5: FUN(X[[1280]], ...)
> 4: lapply(as.list(X), FUN, ...)
> 3: sapply(data.lst, summary.stat, ...)
> 2: t(sapply(data.lst, summary.stat, ...))
> 1: express(tmp, normalize = F, bg = subtractmm, summary.stat = function(x)
> apply(x, 
>        2, mean, trim = 0.2))
> 
> 
> > library(help=affy)
> affy            Methods for Affymetrix Oligonucleotide Arrays
> 
> Description:
> 
> Package: affy
> Title: Methods for Affymetrix Oligonucleotide Arrays
> Version: 1.0
> Author: Rafael A. Irizarry <rafa@jhu.edu>, Laurent Gautier
>         <laurent@cbs.dtu.dk>, and Leslie M. Cope <cope@mts.jhu.edu>
> Description: The package contains functions for exploratory
>         oligonucleotide array analysis.
> Maintainer: Rafael A. Irizarry <rafa@jhu.edu>
> Depends: R (>= 1.5.0), modreg, Biobase, eda, tkWidgets
> Bundle: Bioconductor
> Date: 2002/5/1
> BundleDescription: Contains essential Bioconductor packages.
> URL: http://www.bioconductor.org/
> License: LGPL
> Built: R 1.6.0; i686-pc-linux-gnu; Wed Oct 16 17:03:17 EDT 2002
> 
> 
> 
> -----Original Message-----
> From: Laurent Gautier [mailto:laurent@cbs.dtu.dk]
> Sent: Thursday, October 17, 2002 5:34 PM
> To: Man, Michael
> Cc: 'Laurent Gautier'; 'bioconductor@stat.math.ethz.ch'
> Subject: Re: [BioC] can't get summary.stat after "invariantset"
> normalizat ion
> 
> 
> Well... seems I've been boasting too much (and I was secretely hoping
> it was the devel version...). I do not have run affy-release on 1.6.0,
> I can only suggests reasons.
> 
> > On Thu, Oct 17, 2002 at 11:24:24AM -0400, Man, Michael wrote:
> > > Any idea?  BTW, it works fine when I used unnormalized data.  -michael
> 
> This is the funny part. I wonder why it does 
> > 
> > 
> > Several even... which version of R and of the package are you using ?
> > 
> > 
> > L.
> > 
> > 
> > > 
> > > ###############################
> > > > tmp <- normalize.Plob.invariantset(latin)  # use the first CHIP as
> > > reference
> 
> The following do it too (preferred way):
> 
> tmp <- normalize(latin, method="invariantset")
> 
> 
> > > > latin.LiWong <- express(tmp, normalize=F, bg=subtractmm,
> > > summary.stat=li.wong)
> > > Background correcting
> > > Preparing Data
> > > Computing expression. This may take a while.
> > > Error in while (change > delta & iter < maxit) { : 
> > >         missing value where logical needed
> 
> Would you have variables called 'T' or 'F' around ?
> (the code relied (too much) on (T == TRUE) and (F == FALSE) but this
> was fixed about a week ago I believe). 
> 
> > > In addition: Warning message: 
> > > No convergence in inner loop after 50 in outerler tieration 4 
> > >  in: fit.li.wong(t(data.matrix), remove.outliers, normal.array.quantile,
> 
> > > 
> 
> This only a warning.
> 
> > > 
> > > > latin.LiWong <- express(tmp, normalize=F, summary.stat=li.wong)
> > > Background correcting
> > > Error in density(x, kernel = "epanechnikov", n = n.pts) : 
> > >         x contains missing values
> > >
> 
> In this case the default 'bg' is rma. You might get 'masked' probes from
> your CEL files (to avoid taking them into consideration, try to set the
> flags related to masked and outlying probes to FALSE (see man related
> man page)). I thought it was fixed... but may be only in affy-devel.
> Try to patch bg.parameters with the following:
> 
> bg.parameters <- function(pm, mm, n.pts=2^14){
> 
>   max.density <- function(x,n.pts){
>     aux <- density(x, kernel="epanechnikov", n=n.pts, na.rm=TRUE) 
>     aux$x[order(-aux$y)[1]] 
>   }
>     mmbg <- max.density(mm,n.pts)
>     pmbg <- max.density(pm,n.pts)
> 		   
>     bg.data <- mm[mm < mmbg]
>     bg.data <- bg.data-mmbg
>     bgsd <- sqrt(sum(bg.data^2)/(length(bg.data)-1))*sqrt(2)/.85
> 			  
>     sig.data <- pm[pm > pmbg]
>     sig.data <- sig.data-pmbg
> 			       
>     alpha <- 1/mean(sig.data)
>     mubg <- mmbg
>     list(alpha=alpha,mu=mubg,sigma=bgsd)  
> }
> 				     
> 
> > > 
> > > > latin.avdf <-
> express(tmp,normalize=F,bg=subtractmm,summary.stat=avdiff)
> > > Background correcting
> > > Preparing Data
> > > Computing expression. This may take a while.
> > > Error in var(x, na.rm = na.rm) : missing observations in cov/cor
> > > 
> 
> A patch was contributed for that and I was convinced I applied it to
> both the released and devel version (I cannot find track of it in CVS ?!
> I am (much) confused now...)
> The reason is like above: presence of NAs.
> 
> > > 
> > > > latin.avdf <- express(tmp,normalize=F,bg=subtractmm,
> > > +                       summary.stat=function(x)
> apply(x,2,mean,trim=.2))
> > > Background correcting
> > > Preparing Data
> > > Computing expression. This may take a while.
> > > Error in psort(x, partial) : index 13 outside bounds
> 
> 
> What does give 'traceback()' on this one ?
> 
> 
> 
> 
> L.
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> http://www.stat.math.ethz.ch/mailman/listinfo/bioconductor