[BioC] removing outlier/masked probes and gcrma

Andrew Su asu at gnf.org
Mon Jan 15 20:58:56 CET 2007


Thanks, Jim, for the thoughts below.  Unfortunately, we are using a
custom Affy chip design here, so those precomputed ones won't work for
us.  But we could certainly create a custom CDF file for our chip type
too...

... but it surprises me somewhat that there isn't an alternate solution.
First, what do people do with an AffyBatch object which was read in
using the rm.mask option if it can't be used for further analyses?  (Or
is this a failing in how gcrma specifically deals with NAs?)  And
second, although custom CDFs would be great for dealing with
ChipType-specific effects (e.g., SNPs), how do people deal with
chip-specific effects (e.g., scratches and debris)?  Just a couple of
thoughts...  Any additional ideas are welcome, but we'll be pushing
ahead on custom CDFs in the meantime...
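
For reference, the error message itself is a standard base-R one, so it can be reproduced without gcrma.  The sketch below is base R only and is not gcrma's internal code; the vector `pm` and the replacement values are made up for illustration.  It shows that assigning several values through a subscript that contains NA fails, and that dropping the NA positions first (here with which()) makes the assignment well defined.

```r
# Base-R illustration only; this is NOT gcrma's internal code.
# 'pm' stands in for probe intensities, with NA marking a masked probe.
pm <- c(120, NA, 95, 300, 80)

# A comparison against data containing NA yields NA in the logical index.
below <- pm < 100

# Assigning more than one value through an index with NA is an error.
res <- tryCatch({
  pm[below] <- c(100, 101)
  "no error"
}, error = function(e) conditionMessage(e))
print(res)

# which() keeps only the TRUE positions, silently dropping NA and FALSE,
# so the same assignment now succeeds.
ok <- which(below)
pm[ok] <- c(100, 101)
print(pm)
```

Whether a similar guard would be appropriate inside gcrma's background adjustment is a separate question; this only shows where the wording of the error comes from.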

Cheers,
-andrew 



-----Original Message-----
From: James W. MacDonald [mailto:jmacdon at med.umich.edu] 
Sent: Saturday, January 13, 2007 6:41 AM
To: Andrew Su
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] removing outlier/masked probes and gcrma

Hi Andrew,

Andrew Su wrote:
> I am attempting to use gcrma on AffyBatch objects which were read in
> using the "rm.outliers=TRUE" or "rm.mask=TRUE" options (to the ReadAffy
> function).  For example, I put two MOE430 CEL files in the working
> directory, and here is what I tried:
> 
> > ab <- ReadAffy(filenames=list.celfiles(), rm.outliers=TRUE)
> > ai <- compute.affinities(cdfName(ab))
> > data <- gcrma(ab, ai)
> Adjusting for optical effect..Done.
> Adjusting for non-specific binding.Error in
> gcrma.bg.transformation.fast(pms, bhat, var.y, k = k) : 
>         NAs are not allowed in subscripted assignments

As you can see, you cannot have any NAs in your data to use gcrma. An 
alternative to this is to use the MBNI cdf/probe packages that have the 
probes with SNPs in the central 15 base pairs removed. Anything in this 
listing with SNP in the name has these probes removed.

http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download_v6.asp

Note that there are some downsides to using these cdfs, mainly that the 
standard errors of your estimates will be highly variable, since the 
probesets for these cdfs are quite variable in size (unlike the stock 
affy chip, where the vast majority have 11 probes).
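
To put a rough number on that, here is a minimal sketch under a deliberately simplified assumption: the probeset estimate is treated as the mean of n independent probe values with a common standard deviation, so its standard error is sigma / sqrt(n).  That is not how gcrma actually summarizes probesets, and the value of `sigma` and the probeset sizes other than 11 are hypothetical; the point is only how the standard error changes with probeset size.

```r
# Simplified model (an assumption, not gcrma's summarization step):
# probeset estimate = mean of n independent probe values, common sd sigma.
se_of_mean <- function(sigma, n) sigma / sqrt(n)

sigma    <- 1                      # hypothetical per-probe standard deviation
n_probes <- c(3, 5, 8, 11, 20)     # hypothetical probeset sizes; 11 is the stock size
se       <- se_of_mean(sigma, n_probes)

# Standard error for each probeset size
print(data.frame(n_probes = n_probes, se = round(se, 3)))

# How much larger the SE of a 3-probe set is than that of an 11-probe set
print(se[1] / se[4])
```

Under this model a probeset cut down to 3 probes has a standard error about sqrt(11/3) times that of a stock 11-probe set, which is why a mix of probeset sizes gives estimates of very uneven precision.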

Best,

Jim


> 
> > sessionInfo()
> Version 2.3.1 (2006-06-01)
> i386-pc-mingw32
> 
> attached base packages:
> [1] "splines"   "tools"     "methods"   "stats"     "graphics"  "grDevices"
> [7] "utils"     "datasets"  "base"
> 
> other attached packages:
> mouse4302probe   mouse4302cdf          gcrma    matchprobes           affy
>       "1.10.0"       "1.10.0"        "2.6.0"        "1.4.0"       "1.12.2"
>         affyio        Biobase
>        "1.0.0"       "1.10.1"
> 
> I have tried using both R versions 2.3.1 and 2.1.0, and gcrma versions
> 1.1.4 and 2.6.0, and affy versions 1.12.2 and 1.10.0.  I get a similar
> error when using the rm.mask=TRUE option.  
> 
>  
> 
> My overall goal is to remove select probes from the analysis (in this
> case, probes that overlap known polymorphisms).  Any thoughts on how
> best to do this are most appreciated...
> 
>  
> 
> Cheers,
> 
> -andrew 
> 
>  
> 
> --
> 
> Andrew Su, Ph.D.
> 
> Genomics Institute of the 
> 
>   Novartis Research Foundation
> 
> asu at gnf.org
> 
> Tel: 858-812-1656
> 
> Fax: 858-812-1630
> 
> http://web.gnf.org


-- 
James W. MacDonald
University of Michigan
Affymetrix and cDNA Microarray Core
1500 E Medical Center Drive
Ann Arbor MI 48109
734-647-5623


