[BioC] LPE error caused by gcRMA [Error in var.M.adap[i] <- ifelse(!is.na(var.M.adap[i - 1]), mean(var.M.adap[i + : replacement has length zero]

Tue Nov 4 17:21:10 CET 2008

04/11/2008 12:48 Richard Pearson wrote
> Charlie
> 
> Following on from what Guido has said, you could test your hypothesis
> that the identical values produced by the newer version of gcrma are the
> problem by adding a small amount of random noise to the gcrma expression
> values and repeating the LPE analysis. I'm not familiar with LPE myself,
> but have seen similar problems due to the identical values produced by
> gcrma with other methods.
> 

Wait a minute ... the raw intensities of one probe vary randomly across
arrays. Most normalisation methods remove some of this variation (the
systematic part), while some remains. Now we have a method that, for
some probes, totally removes this variation and provides identical
estimates across arrays.

Simply adding artificial noise (from which distribution?) won't solve
any real problem.
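
Before changing anything, it may be more useful to see how many probesets
are actually affected. The following is only a rough sketch in base R on a
made-up toy matrix; the names mat, is.const and mat.var are illustrative
and do not come from the session quoted below. With real data one would
presumably start from exprs(test.gcrma) instead of the toy matrix.

```r
# Toy stand-in for a gcRMA expression matrix: 6 probesets x 4 arrays.
# The values are invented purely for illustration.
set.seed(0)
mat <- matrix(rnorm(24, mean = 8), nrow = 6,
              dimnames = list(paste0("ps", 1:6), paste0("array", 1:4)))
mat[c(2, 5), ] <- 2.2   # two probesets with identical values on every array

# A probeset is "constant" if its minimum and maximum across arrays coincide.
rng <- apply(mat, 1, range)        # 2 x nrow(mat): row 1 = min, row 2 = max
is.const <- rng[1, ] == rng[2, ]

sum(is.const)                      # how many probesets are constant
names(which(is.const))             # which ones

# The probesets that still vary across arrays.
mat.var <- mat[!is.const, , drop = FALSE]
dim(mat.var)
```

Whether setting the constant probesets aside is appropriate for a given
downstream method is a separate question; the sketch only counts them.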

Best wishes
 Wolfgang

------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber

> Best wishes
> 
> Richard.
> 
> Hooiveld, Guido wrote:
>> Hi,
>>
>> Just to confirm that I and others have also observed that GCRMA has
>> recently (since v2.12) produced identical values for many probesets.
>> See e.g.:
>> http://thread.gmane.org/gmane.science.biology.informatics.conductor/18844/focus=18914
>> This behaviour is related to a modification in how Gene Specific
>> Background (GSB) is handled in GCRMA v2.12, compared to previous
>> versions.
>>
>> G
>>> -----Original Message-----
>>> From: bioconductor-bounces at stat.math.ethz.ch
>>> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of charliew
>>> Sent: 04 November 2008 12:12
>>> To: Wolfgang Huber
>>> Cc: bioconductor at stat.math.ethz.ch
>>> Subject: Re: [BioC] LPE error caused by gcRMA [Error in var.M.adap[i]
>>> <- ifelse(!is.na(var.M.adap[i - 1]), mean(var.M.adap[i + : replacement
>>> has length zero]
>>>
>>> Hi Wolfgang,
>>>
>>>> (i) do you (and others) consider this an "error", or rather "bad
>>>> behaviour" or "poor performance"?
>>> I'd have to say it is an error because I feel like it should work but
>>> it doesn't. Although it is also bad behaviour and poor performance.
>>>
>>>> (ii) and is it gcrma, or LPE that errs or poorly performs?
>>> I don't really know. Both packages can work fine.
>>>
>>> LPE works great for RMA data or any other data matrices from
>>> different array platforms. It just does not work on any gcRMA data
>>> that I have tried.
>>>
>>> gcRMA produces very reasonable summarized data so it seems to work
>>> fine too.
>>>
>>> Things break down when I try to take gcRMA data to LPE. I feel like
>>> it has something to do with rows that have identical values in all
>>> lanes.
>>> I think that might be a newer "feature" of gcRMA but I'm not sure.
>>>
>>> The context of this question is I have a batch of old array data that
>>> is being prepped for publication.
>>> Back in early 2006 I identified a set of potentially differentially
>>> expressed genes by summarizing the data with gcRMA, then diff testing
>>> with LPE.
>>>
>>> Unfortunately I didn't note versions of software and whatnot. Totally
>>> my fault.
>>>
>>> To gather the information I needed to write a good methods section, I
>>> wanted to repeat the analysis right now and more carefully document
>>> what I did.
>>> Trouble is, I get this error out of LPE and that is making it hard to
>>> exactly duplicate the old results.
>>>
>>> One thing I know for sure is that the old and new gcRMA data are not
>>> identical, so that led me to think that a gcRMA change is the source
>>> of the problem.
>>>
>>>> (iii) and are any of the maintainers of these packages interested in
>>>> these questions?
>>> hopefully...
>>>
>>>> Best wishes
>>>> Wolfgang
>>>>
>>>> ------------------------------------------------------------------
>>>> Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
>>>>
>>>>
>>>> 31/10/2008 20:05 charliew wrote
>>>>> Hi Patrick,
>>>>> Thanks a lot for the quick reply. I updated the package and it didn't
>>>>> fix the error.
>>>>>
>>>>> c
>>>>> On Oct 31, 2008, at 3:29 PM, Patrick Aboyoun wrote:
>>>>>
>>>>>> Charlie,
>>>>>> I don't know if this is related to your issue, but a bug in the gcrma
>>>>>> package was just fixed and version 2.14.1 is now up on
>>>>>> bioconductor.org. Update to the latest version of gcrma and see if
>>>>>> it addresses your issue.
>>>>>>
>>>>>>
>>>>>> Patrick
>>>>>>
>>>>>>
>>>>>>
>>>>>> charliew wrote:
>>>>>>> Dear List,
>>>>>>> I've encountered the following error when running LPE:
>>>>>>>
>>>>>>> Error in var.M.adap[i] <- ifelse(!is.na(var.M.adap[i - 1]),
>>>>>>> mean(var.M.adap[i +  :
>>>>>>> replacement has length zero
>>>>>>>
>>>>>>> It happens when the CEL files have been processed with gcRMA but
>>>>>>> not when they have been processed with RMA.
>>>>>>> I'm not positive about this, but I think this error first started
>>>>>>> happening with the upgrade to gcRMA 2.x. I think it is happening
>>>>>>> because gcRMA is producing a lot of probes with identical
>>>>>>> expression values.
>>>>>>>
>>>>>>> Here is a test session that causes the error. Upon request I can
>>>>>>> provide a tarball of the test data, but any collection of CEL files
>>>>>>> will reproduce the error.
>>>>>>> The error also occurs if you run gcRMA from within onecolorGUI or
>>>>>>> affylmGUI.
>>>>>>> It also happens if you first write the expression data to a file
>>>>>>> with write.exprs, then read it back in with read.table.
>>>>>>>
>>>>>>> #Loading the packages
>>>>>>>> library(affy)
>>>>>>> Loading required package: Biobase
>>>>>>> Loading required package: tools
>>>>>>>
>>>>>>>> library(gcrma)
>>>>>>> Loading required package: matchprobes
>>>>>>> Loading required package: splines
>>>>>>>
>>>>>>>> library(LPE)
>>>>>>>> set.seed(0)
>>>>>>> #Reading in 4 CEL files
>>>>>>>> test.Dat<-ReadAffy()
>>>>>>> #Summarizing with gcRMA
>>>>>>>> test.gcrma<-gcrma(test.Dat)
>>>>>>> Adjusting for non-specific binding....Done.
>>>>>>> Normalizing
>>>>>>> Calculating Expression
>>>>>>>
>>>>>>> #Summarizing with RMA
>>>>>>>> test.rma<-rma(test.Dat)
>>>>>>> Background correcting
>>>>>>> Normalizing
>>>>>>> Calculating Expression
>>>>>>>
>>>>>>> #Extracting gcRMA assay data
>>>>>>>> test.gcrma.MAT<-exprs(test.gcrma)
>>>>>>>> dim(test.gcrma.MAT)
>>>>>>> [1] 15611     4
>>>>>>>
>>>>>>> #Extracting RMA assay data
>>>>>>>
>>>>>>>> test.rma.MAT<-exprs(test.rma)
>>>>>>>> dim(test.rma.MAT)
>>>>>>> [1] 15611     4
>>>>>>>
>>>>>>> #Running LPE function on gcRMA data and the resulting error
>>>>>>>> var.test.gcrma<-baseOlig.error(test.gcrma.MAT, q= 0.01)
>>>>>>> Error in var.M.adap[i] <- ifelse(!is.na(var.M.adap[i - 1]),
>>>>>>> mean(var.M.adap[i +  :
>>>>>>> replacement has length zero
>>>>>>>
>>>>>>> #Running LPE function on RMA data - it successfully completes
>>>>>>>> var.test.rma<-baseOlig.error(test.rma.MAT, q= 0.01)
>>>>>>>>
>>>>>>> #My session info
>>>>>>>> sessionInfo()
>>>>>>> R version 2.8.0 (2008-10-20)
>>>>>>> i386-apple-darwin8.11.1
>>>>>>>
>>>>>>> locale:
>>>>>>> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>>>>>>
>>>>>>> attached base packages:
>>>>>>> [1] splines   tools     stats     graphics  grDevices utils
>>>>>>> datasets  methods   base
>>>>>>>
>>>>>>> other attached packages:
>>>>>>> [1] xenopuslaevisprobe_2.3.0 xenopuslaeviscdf_2.3.0
>>>>>>> LPE_1.16.0               gcrma_2.14.0
>>>>>>> matchprobes_1.14.0       affy_1.20.0              Biobase_2.2.0
>>>>>>>
>>>>>>> loaded via a namespace (and not attached):
>>>>>>> [1] affyio_1.10.0        preprocessCore_1.4.0
>>>>>>>
>>>>>>> Thanks a lot for your help
>>>>>>>
>>>>>>> Charlie
>>>>>>>
>>>>>>>
>>>>>>> ----------------------------------------------------------------------------
>>>>>>>
>>>>>>> Charlie Whittaker, Ph.D.
>>>>>>> Bioinformatics and Computing Core Facility
>>>>>>> The David H. Koch Institute for Integrative Cancer Research at MIT
>>>>>>> 77 Mass Ave E18-366
>>>>>>> Cambridge, MA 02139
>>>>>>>
>>>>>>> 617-324-0337
>>>>>>>