[BioC] LPE error caused by gcRMA [Error in var.M.adap[i] <- ifelse(!is.na(var.M.adap[i - 1]), mean(var.M.adap[i + : replacement has length zero]

Richard Pearson richard.pearson at postgrad.manchester.ac.uk
Wed Nov 5 11:57:45 CET 2008


Wolfgang

I agree with you completely. I'm not suggesting that adding random noise will
solve any real problem; I'm only suggesting that Charlie could do this to test
whether his hypothesis, that the identical values are causing the error, is
correct. Personally I think adding random noise should be avoided in any real
analysis.
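
A minimal sketch of that diagnostic in R, for illustration only. It uses a
small toy matrix in place of Charlie's test.gcrma.MAT; the helper names
count_constant_rows and add_jitter and the noise level sd = 1e-6 are
assumptions made for this example and are not part of LPE or gcrma.

```r
# Toy stand-in for a gcRMA expression matrix (rows = probesets, cols = arrays).
set.seed(0)
toy <- matrix(rnorm(40, mean = 8), nrow = 10, ncol = 4)
toy[1:3, ] <- 2.5   # mimic probesets with identical values on every array

# Count rows whose values are identical across all arrays.
count_constant_rows <- function(mat) {
  sum(apply(mat, 1, function(x) length(unique(x)) == 1))
}

# Add a very small amount of Gaussian noise. Diagnostic use only: the noise
# level is arbitrary and the result should not feed a real analysis.
add_jitter <- function(mat, sd = 1e-6) {
  mat + rnorm(length(mat), sd = sd)
}

toy.jit <- add_jitter(toy)
count_constant_rows(toy)      # constant rows before jittering
count_constant_rows(toy.jit)  # constant rows after jittering

# With the real data, the comparison would be (requires the LPE package):
#   baseOlig.error(test.gcrma.MAT, q = 0.01)              # fails as reported
#   baseOlig.error(add_jitter(test.gcrma.MAT), q = 0.01)  # does it still fail?
```

If the second call completes, that supports the idea that the tied values
trigger the error; it says nothing about whether the jittered results are
meaningful.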

Richard.


Wolfgang Huber wrote:
> 
> 
> 04/11/2008 12:48 Richard Pearson scripsit
>> Charlie
>>
>> Following on from what Guido has said, you could test your hypothesis
>> that the identical values produced by the newer version of gcrma are the
>> problem by adding a small amount of random noise to the gcrma expression
>> values and repeating the LPE analysis. I'm not familiar with LPE myself,
>> but have seen similar problems due to the identical values produced by
>> gcrma with other methods.
>>
> 
> 
> Wait a minute ... the raw intensities of one probe vary randomly across
> arrays. Most normalisation methods remove some of this variation (the
> systematic part), while some remains. Now we have a method that -for
> some probes- totally removes this variation, and provides identical
> estimates across arrays.
> 
> Simply adding artificial noise (from which distribution?) won't solve
> any real problem.
> 
> Best wishes
>  Wolfgang
> 
> ------------------------------------------------------------------
> Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
> 
> 
> 
>> Best wishes
>>
>> Richard.
>>
>> Hooiveld, Guido wrote:
>>> Hi,
>>>
>>> Just to confirm that I and others also observed that since recently
>>> GCRMA (>v 2.12) produces identical values for many probesets.
>>> See e.g.:
>>> http://thread.gmane.org/gmane.science.biology.informatics.conductor/18844/focus=18914
>>> This behaviour is related to a modification in how Gene Specific
>>> Background (GSB) is handled in GCRMA v2.12, compared to previous
>>> versions.
>>>
>>> G
>>>> -----Original Message-----
>>>> From: bioconductor-bounces at stat.math.ethz.ch
>>>> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of charliew
>>>> Sent: 04 November 2008 12:12
>>>> To: Wolfgang Huber
>>>> Cc: bioconductor at stat.math.ethz.ch
>>>> Subject: Re: [BioC] LPE error caused by gcRMA [Error in var.M.adap[i]
>>>> <- ifelse(!is.na(var.M.adap[i - 1]),mean(var.M.adap[i + : replacement
>>>> has length zero]
>>>>
>>>> Hi Wolfgang,
>>>>
>>>>> (i) do you (and others) consider this an "error", or rather "bad
>>>>> behaviour" or "poor performance"?
>>>> I'd have to say it is an error because I feel like it should work but
>>>> it doesn't. Although it is also bad behaviour and poor performance.
>>>>
>>>>> (ii) and is it gcrma, or LPE that errs or poorly performs?
>>>> I don't really know. Both packages can work fine.
>>>>
>>>> LPE works great for RMA data or any other data matrices from
>>>> different array platforms. It just does not work on any gcRMA data
>>>> that I have tried.
>>>>
>>>> gcRMA produces very reasonable summarized data so it seems to work
>>>> fine too.
>>>>
>>>> Things break down when I try to take gcRMA data to LPE. I feel like
>>>> it has something to do with rows that have identical values in all
>>>> lanes.
>>>> I think that might be a newer "feature" of gcRMA but I'm not sure.
>>>>
>>>> The context of this question is I have a batch of old array data that
>>>> is being prepped for publication.
>>>> Back in early 2006 I identified a set of potentially differentially
>>>> expressed genes by summarizing the data with gcRMA, then diff testing
>>>> with LPE.
>>>>
>>>> Unfortunately I didn't note versions of software and whatnot. Totally
>>>> my fault.
>>>>
>>>> To gather the information I needed to write a good methods section, I
>>>> wanted to repeat the analysis right now and more carefully document
>>>> what I did.
>>>> Trouble is, I get this error out of LPE and that is making it hard to
>>>> exactly duplicate the old results.
>>>>
>>>> One thing I know for sure is the old and new gcRMA data are not
>>>> identical so that led me to think that a gcRMA change is the source
>>>> of the problem.
>>>>
>>>>> (iii) and are any of the maintainers of these packages interested in
>>>>> these questions?
>>>> hopefully...
>>>>
>>>>> Best wishes
>>>>> Wolfgang
>>>>>
>>>>> ------------------------------------------------------------------
>>>>> Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
>>>>>
>>>>>
>>>>> 31/10/2008 20:05 charliew scripsit
>>>>>> Hi Patrick,
>>>>>> Thanks a lot for the quick reply. I updated the package and it didn't
>>>>>> fix the error.
>>>>>>
>>>>>> c
>>>>>> On Oct 31, 2008, at 3:29 PM, Patrick Aboyoun wrote:
>>>>>>
>>>>>>> Charlie,
>>>>>>> I don't know if this is related to your issue, but a bug in the gcrma
>>>>>>> package was just fixed and version 2.14.1 is now up on
>>>>>>> bioconductor.org. Update to the latest version of gcrma and see if
>>>>>>> it addresses your issue.
>>>>>>>
>>>>>>>
>>>>>>> Patrick
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> charliew wrote:
>>>>>>>> Dear List,
>>>>>>>> I've encountered the following error when running LPE:
>>>>>>>>
>>>>>>>> Error in var.M.adap[i] <- ifelse(!is.na(var.M.adap[i - 1]),
>>>>>>>> mean(var.M.adap[i +  :
>>>>>>>> replacement has length zero
>>>>>>>>
>>>>>>>> It happens when the CEL files have been processed with gcRMA but
>>>>>>>> not when they have been processed with RMA.
>>>>>>>> I'm not positive about this but I think this error first started
>>>>>>>> happening with the upgrade to gcRMA 2.x. I think it is happening
>>>>>>>> because gcRMA is producing a lot of probes with identical
>>>>>>>> expression values.
>>>>>>>>
>>>>>>>> Here is a test session that causes the error. Upon request I can
>>>>>>>> provide a tarball of the test data but any collection of CEL files
>>>>>>>> will reproduce the error.
>>>>>>>> The error also occurs if you run gcRMA from within onecolorGUI or
>>>>>>>> affylmGUI.
>>>>>>>> It also happens if you first write the expression data to a file
>>>>>>>> with write.exprs, then read it back in with read.table.
>>>>>>>>
>>>>>>>> #Loading the packages
>>>>>>>>> library(affy)
>>>>>>>> Loading required package: Biobase
>>>>>>>> Loading required package: tools
>>>>>>>>
>>>>>>>>> library(gcrma)
>>>>>>>> Loading required package: matchprobes
>>>>>>>> Loading required package: splines
>>>>>>>>
>>>>>>>>> library(LPE)
>>>>>>>>> set.seed(0)
>>>>>>>> #Reading in 4 CEL files
>>>>>>>>> test.Dat<-ReadAffy()
>>>>>>>> #Summarizing with gcRMA
>>>>>>>>> test.gcrma<-gcrma(test.Dat)
>>>>>>>> Adjusting for non-specific binding....Done.
>>>>>>>> Normalizing
>>>>>>>> Calculating Expression
>>>>>>>>
>>>>>>>> #Summarizing with RMA
>>>>>>>>> test.rma<-rma(test.Dat)
>>>>>>>> Background correcting
>>>>>>>> Normalizing
>>>>>>>> Calculating Expression
>>>>>>>>
>>>>>>>> #Extracting gcRMA assay data
>>>>>>>>> test.gcrma.MAT<-exprs(test.gcrma)
>>>>>>>>> dim(test.gcrma.MAT)
>>>>>>>> [1] 15611     4
>>>>>>>>
>>>>>>>> #Extracting RMA assay data
>>>>>>>>
>>>>>>>>> test.rma.MAT<-exprs(test.rma)
>>>>>>>>> dim(test.rma.MAT)
>>>>>>>> [1] 15611     4
>>>>>>>>
>>>>>>>> #Running LPE function on gcRMA data and the resulting error
>>>>>>>>> var.test.gcrma<-baseOlig.error(test.gcrma.MAT, q= 0.01)
>>>>>>>> Error in var.M.adap[i] <- ifelse(!is.na(var.M.adap[i - 1]),
>>>>>>>> mean(var.M.adap[i +  :
>>>>>>>> replacement has length zero
>>>>>>>>
>>>>>>>> #Running LPE function on RMA data - it successfully completes
>>>>>>>>> var.test.rma<-baseOlig.error(test.rma.MAT, q= 0.01)
>>>>>>>>>
>>>>>>>> #My session info
>>>>>>>>> sessionInfo()
>>>>>>>> R version 2.8.0 (2008-10-20)
>>>>>>>> i386-apple-darwin8.11.1
>>>>>>>>
>>>>>>>> locale:
>>>>>>>> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>>>>>>>
>>>>>>>> attached base packages:
>>>>>>>> [1] splines   tools     stats     graphics  grDevices utils
>>>>>>>> datasets  methods   base
>>>>>>>>
>>>>>>>> other attached packages:
>>>>>>>> [1] xenopuslaevisprobe_2.3.0 xenopuslaeviscdf_2.3.0
>>>>>>>> LPE_1.16.0               gcrma_2.14.0
>>>>>>>> matchprobes_1.14.0       affy_1.20.0              Biobase_2.2.0
>>>>>>>>
>>>>>>>> loaded via a namespace (and not attached):
>>>>>>>> [1] affyio_1.10.0        preprocessCore_1.4.0
>>>>>>>>
>>>>>>>> Thanks a lot for your help
>>>>>>>>
>>>>>>>> Charlie
>>>>>>>>
>>>>>>>>
>>>>>>>> ----------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> Charlie Whittaker, Ph.D.
>>>>>>>> Bioinformatics and Computing Core Facility The David H. Koch
>>>>>>>> Institute for Integrative Cancer Research At MIT
>>>>>>>> 77 Mass Ave E18-366
>>>>>>>> Cambridge, MA 02139
>>>>>>>>
>>>>>>>> 617-324-0337
>>>>>>>>
> 

-- 
Richard D. Pearson             richard.pearson at postgrad.manchester.ac.uk
School of Computer Science,    http://www.cs.man.ac.uk/~pearsonr
University of Manchester,      Tel: +44 161 275 6178
Oxford Road,                   Mob: +44 7971 221181
Manchester M13 9PL, UK.        Fax: +44 161 275 6204


