[BioC] option rm.mask=TRUE in ReadAffy

Laurent Gautier laurent at cbs.dtu.dk
Fri Oct 31 12:15:52 MET 2003


This is a nice way to solve the problem. 

I did not suspect many were suffering from that.
In my particular case I skip the bg.correct step quite often
(and this appears to be the trouble. Most of the
other preprocessing steps seem to behave (see my previous
mail in this thread). Note that Ben Bosltad
recalled (earlier in this very same thread)
that the bgcorrect step tries to "correct" the signal
for the probes. Masking probes away removes a part of the need
for background correction (in my humble opinion, for the
Affymetrix chips, and for many of these I chips I had; this
might not be good be everyone)).

The masked (and outliers) found in the CEL should probably be stored
in the annotation/description for easier retrieval (but at AFAIR,
this has been shortly in the devel-version some time ago (or was almost there ;)),
public request will hopefully bring it back that).

I might have an other way to solve the problem:
(I have never seen '.MSK' files. The 'masked probes' I know of
are included in the CEL files. The CEL files also contain
a section 'outliers').


## have the file names in filenames. For example :
filenames <- list.files("where/I/want", "CEL", full.names=TRUE)

abatch <- ReadAffy()

## compressed CEL files ?
compress <- FALSE

ids.list <- vector("list", length=length(filenames))

for (i in seq(along=filenames)) {
  file <- filenames[i]
  masked.xy <- .Call("getIndexExtraFromCEL", as.character(file),
                     as.character("MASKS"),
                     as.integer(compress))
  masked.i <- xy2indices(masked.xy[,1], masked.xy[, 2], abatch=abatch) 
  outliers.xy <- .Call("getIndexExtraFromCEL", as.character(file),
                        as.character("OUTLIERS"),
                        as.integer(compress))
  outliers.i <- xy2indices(outliers.xy[, 1], outliers.xy[, 2], abatch=abatch)
  ids.list[[i]] <- rbind(masked.i, outliers.i) 
} 

## then whenever ones judges the time has come to mask the outliers:

for (i in seq(along=filenames)) {
  intensity(abatch)[ids[[i]], i] <- NA
}




Best,



L.



 
On Fri, Oct 31, 2003 at 07:08:03PM +0900, t-kawai at hhc.eisai.co.jp wrote:
> I have the same problem with expresso() as weiss.
> 
> To solve this problem, I executed a series of procedures step by step.
> 
> Background correction, I think, needs all of the CEL data wether those are
> to be masked or not.  From the normalization step, I removed masking data
> using Affy's original mask file.
> 
> For example,
> 
> ## read CEL files
> dat0 <- ReadAffy();
> 
> ## Normalization step
> dat <- bg.correct.mas(dat0);
> 
> ## read mask file (in this case, 1803 probes to be masked)
> msk <- scan("Hu6800_ClassA.MSK", skip=2, list("", ""));
> 
> ## set ids (cell index list to be masked)
> ## (in this case, 59540 cells will be masked)
> for (i in 1:length(msk[[1]])) {
>         nam <- msk[[1]][i];
>         txt <- gsub("-", ":", msk[[2]][i]);
>         lst <- eval(parse(text=paste("c(", txt, ")")));
> 
>         if (i == 1) {
>                 ids <- pmindex(dat, nam)[[1]][lst];
>         } else {
>                 ids <- c(ids, pmindex(dat, nam)[[1]][lst]);
>         }
>         ids <- c(ids, mmindex(dat, nam)[[1]][lst]);
> }
> 
> ## set NAs to cells to be maksed
> intensity(dat)[ids, ] <- NA;
> 
> ## Normalization step
> dat1 <- normalize.AffyBatch.qspline(dat);
> 
> ## Probe correction & summary step
> dat2 <- computeExprSet(dat1, pmcorrect="mas", summary.method="liwong");
> write.exprs(dat2, file="result.lst");
> 
> 
> 
> The file "Hu6800_ClassA.MSK" looks like
> Hu6800
> [Call]
> A28102_at       17,18,19,20
> AB000381_s_at   7
> AC000064_cds2_at        18,19,20
> ...
> M81830_at       1-20
> M83181_at       1-20
> ...
> 
> 
> Above script works rightly?  Please give me a comment...
> 
> Kawai
> 
> _______________________________________
> 
> Takatoshi Kawai, Ph.D.
> 
> Senior Sientist, Bioinformatics
> Laboratoy of Seeds Finding Technology
> Eisai Co., Ltd.
> 5-1-3 Tokodai, Tsukuba-shi,
> Ibaraki 300-2635, Japan
> 
> TEL: +81-29-847-7192
> FAX: +81-29-847-7614
> e-mail: t-kawai at hhc.eisai.co.jp
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

-- 
--------------------------------------------------------------
Laurent Gautier			CBS, Building 208, DTU
PhD. Student			DK-2800 Lyngby,Denmark	
tel: +45 45 25 24 89		http://www.cbs.dtu.dk/laurent



More information about the Bioconductor mailing list