[BioC] error in matchprobes package

Yan Zhang yzhang at vbi.vt.edu
Fri Sep 14 16:16:22 CEST 2007


Zhijin:

Thank you very much for identifying the problem.

yan

Zhijin Wu wrote:

> I think Saroj is right. The cdf package gives only 153525 PM indices, 
> but there are 158334 probe sequences. So there are some probes with 
> a sequence for which the cdf package has no location.
> GCRMA can handle the situation where sequences for some probes 
> are not given (since in the past there were incomplete probe sequence 
> files), but it expects no more probe sequences than those given in the 
> cdf pkg.
>
> I could adjust gcrma to take the intersection of what cdf and probe 
> packages have in common, but I wonder if this discrepancy between cdf 
> and probe package is something we would expect.
>
> pmIndex <- unlist(indexProbes(new("AffyBatch", cdfName = "ehis1a520285f"), "pm"))
> length(pmIndex)
> [1] 153525
>
> p <- get("ehis1a520285fprobe")$sequence
> length(p)
> [1] 158334
>
>
>
>
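The intersection idea above can be sketched in base R. This is only a toy illustration under stated assumptions: `cdf_pm_index` and `probe_table` are hypothetical stand-ins with made-up values, not the real ehis1a520285f packages, and this is not gcrma's actual code.

```r
# Hypothetical toy data: cell indices of PM probes known to the CDF, and the
# cell indices listed in the probe sequence table.
cdf_pm_index <- c(3L, 7L, 8L, 12L)
probe_table <- data.frame(
  index    = c(3L, 7L, 9L, 12L, 15L),   # 9 and 15 are absent from the CDF
  sequence = c("ACGT", "CGTA", "GTAC", "TACG", "AACC"),
  stringsAsFactors = FALSE
)

# Position of each probe-table entry within the CDF index (NA if no match)
pos <- match(probe_table$index, cdf_pm_index)

# Split into probes present in both sources and probes with no CDF location
keep   <- !is.na(pos)
common <- probe_table[keep, ]
orphan <- probe_table[!keep, ]

# Assigning through the unfiltered positions fails because of the NAs
affinity <- numeric(length(cdf_pm_index))
res <- tryCatch({
  affinity[pos] <- seq_along(pos)
  "ok"
}, error = function(e) conditionMessage(e))

# Assigning through the filtered positions works; CDF probes without a
# sequence (index 8 here) simply keep their initial value
affinity[pos[keep]] <- nchar(common$sequence)
```

Here `res` holds the error message from the unfiltered assignment, and `orphan` lists the probe-table rows that would need to be dropped or investigated.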
> Yan Zhang wrote:
>
>> Saroj:
>>
>> I think what you are talking about is an old issue.
>> I already edited the probesequence file and removed the mismatched 
>> probeset IDs, so the two files now match each other. Then I used the 
>> matched CDF and probesequence files to generate packages using R 2.6 
>> alpha. When I run GCRMA, I still get this error message.
>>
>> Adjusting for optical effect..Done.
>> Computing affinities.Error in tmp.exprs[pmIndex[subIndex]] = apm :
>>  NAs are not allowed in subscripted assignments
>>
>> That is why I e-mailed Dr. Wu and the Bioconductor list to ask for 
>> further help.
>>
>> We could discuss this issue sometime today.
>>
>> best
>> yan
>>
>> smohapat at vbi.vt.edu wrote:
>>
>>> Jim and all,
>>>
>>> I have been following the messages online and had a chance to talk with
>>> Yan and look at the error messages yesterday.
>>>
>>> I think the problem is caused by a discrepancy between the cdf and
>>> probeseq files that Yan received from his collaborator. As I understand
>>> it, the probesequence file contains more probeset IDs than the CDF file:
>>> 614 probeset IDs could not be found in the CDF file. Yan, please correct
>>> me if I am wrong.
>>>
>>> I am guessing that matchprobes adds NAs for the IDs missing from the
>>> CDF, and that this causes the error during gcrma.
>>>
>>> Best,
>>>
>>> Saroj
>>>
>>> On Thu, September 13, 2007 11:38 am, James W. MacDonald wrote:
>>>  
>>>
>>>> Hi Yan,
>>>>
>>>>
>>>> I have no idea why you were having problems with this, unless you
>>>> didn't upgrade to R-devel like I suggested. I didn't have any problems
>>>> building this package.
>>>>
>>>> Rather than trying to talk you through building this yourself, I have
>>>> put it up for download:
>>>>
>>>> http://www.umich.edu/~jmacdon/ehis1a520285fprobe_0.0.1.tar.gz
>>>>
>>>>
>>>> Best,
>>>>
>>>>
>>>> Jim
>>>>
>>>>
>>>>
>>>>
>>>> yzhang at vbi.vt.edu wrote:
>>>>  
>>>>
>>>>> Jim:
>>>>>
>>>>>
>>>>> I put my cdf and probesequence files and one cel file at the following
>>>>> URL. If you are willing to reproduce my problem, you could download them
>>>>> and try on your machine: http://ci.vbi.vt.edu/yan/newcdf/huber.html
>>>>> Thanks a lot.
>>>>> yan
>>>>>
>>>>>
>>>>> On Wed, September 12, 2007 4:01 pm, James W. MacDonald wrote:
>>>>>
>>>>>
>>>>>    
>>>>>
>>>>>> Yan Zhang wrote:
>>>>>>
>>>>>>
>>>>>>      
>>>>>>
>>>>>>> jim:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I was wrong. That chip does have MM probes. I just checked it using
>>>>>>> the mm function in the affy package. I thought it had only PM probes
>>>>>>> because only PM probes are in the probesequence file. Do you have any
>>>>>>> suggestions for solving that error message?
>>>>>>>         
>>>>>>
>>>>>> Sure. You have two choices. You can add comparewithcdf=FALSE to your
>>>>>> call to makeProbePackage(), which will eliminate the warnings
>>>>>> because you will no longer be comparing to the cdf. This is the
>>>>>> simplest answer, but regrettably the most dangerous as well.
>>>>>>
>>>>>> Otherwise, you could
>>>>>>
>>>>>>
>>>>>>
>>>>>> debug(.lgExtraParanoia)
>>>>>>
>>>>>> before running makeProbePackage(), and then step through that
>>>>>> function, looking at what you get for pm1, mm1, pm2, and mm2 to see
>>>>>> why you are getting the error in the first place. I have to assume
>>>>>> one of those variables is ending up as an NA (usually this happens
>>>>>> because there aren't any MMs). Then you will have to figure out what
>>>>>> to do with this information.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>      
>>>>>>
>>>>>>> best yan
>>>>>>>
>>>>>>> James W. MacDonald wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>        
>>>>>>>
>>>>>>>> Hi Yan,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> First, please don't take things off-list. The archives are
>>>>>>>> intended to be a resource, and if the questions/answers become
>>>>>>>> private then we have less of a resource.
>>>>>>>>
>>>>>>>> Yan Zhang wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>          
>>>>>>>>
>>>>>>>>> Thank you very much for your response.
>>>>>>>>> Yes, that chip only has PM. Then, what can I do?
>>>>>>>>> I need to solve this problem in order to continue.
>>>>>>>>> As for the warning messages, can I just ignore them? I doubt it,
>>>>>>>>> because later, when I use GCRMA, those NAs will cause trouble in
>>>>>>>>> the compute.affinities function. What can I do? Can I just delete
>>>>>>>>> the header of the probesequence file?
>>>>>>>>>             
>>>>>>>>
>>>>>>>> You won't be able to do GCRMA with a PM-only chip. GCRMA uses the
>>>>>>>> MM probes to compute a background estimate, and if you don't have
>>>>>>>> MM probes you won't be able to do that.
>>>>>>>>
>>>>>>>> As for the second question (which is a moot point now), you don't
>>>>>>>> want to delete the head of the probe_tab file. As I mentioned in
>>>>>>>> my earlier reply you would need to use the devel version of
>>>>>>>> matchprobes with R-2.6.0alpha.
>>>>>>>>
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Jim
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>          
>>>>>>>>
>>>>>>>>> best yan
>>>>>>>>>
>>>>>>>>> James W. MacDonald wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>            
>>>>>>>>>
>>>>>>>>>> Hi Yan,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> yzhang at vbi.vt.edu wrote:
>>>>>>>>>>
>>>>>>>>>>              
>>>>>>>>>>
>>>>>>>>>>> When I use the makeProbePackage function in the newest version
>>>>>>>>>>> of the matchprobes package (1.8.1), I get the following error
>>>>>>>>>>> message:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>                
>>>>>>>>>>>
>>>>>>>>>>> makeProbePackage("ehis1a520285f", version="1.0", species="ehis",
>>>>>>>>>>>     maintainer="yanzhang<yzhang at vbi.vt.edu>", build=FALSE,
>>>>>>>>>>>     check=FALSE, force=TRUE)
>>>>>>>>>>> Importing the data.
>>>>>>>>>>> Error in rep(NA, max(pm1, mm1, pm2, mm2)) : invalid 'times' argument
>>>>>>>>>>> In addition: Warning messages:
>>>>>>>>>>> 1: NAs introduced by coercion in: as.integer.default(dat[[2]])
>>>>>>>>>>> 2: NAs introduced by coercion in: as.integer.default(dat[[3]])
>>>>>>>>>>> 3: NAs introduced by coercion in: as.integer.default(dat[[4]])
>>>>>>>>>>>
>>>>>>>>>>>                 
>>>>>>>>>>
>>>>>>>>>> The error comes from code that compares the probeset IDs from
>>>>>>>>>> the probe package with the cdf package, and IIRC this happens
>>>>>>>>>> when you have a PM-only chip. Is this chip PM-only?
>>>>>>>>>>
>>>>>>>>>> The warnings come from an unfortunate change that was made to
>>>>>>>>>> getProbeDataAffy() that I have fixed in the devel version (and
>>>>>>>>>> have no idea right now why I didn't push to the release as
>>>>>>>>>> well...). The problem stems from the fact that you are
>>>>>>>>>> reading in the whole probe_tab file, including the header.
>>>>>>>>>> When the (x,y) coordinates and probe interrogation position
>>>>>>>>>> data are coerced to integer, the first value for each is
>>>>>>>>>> character, which is coerced to NA.
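The coercion behaviour described above can be reproduced in base R. This is a minimal sketch with made-up values (`x_raw` and its contents are hypothetical, not taken from the real probe_tab file). It shows a header string becoming NA on integer coercion, and an NA passed as the 'times' argument of rep() producing an error. Whether this is the exact path taken inside matchprobes is not confirmed here.

```r
# A toy column as it would look if the header line were read as data
# (the header text and values are hypothetical)
x_raw <- c("Probe X", "402", "17", "385")

# Coercion turns the header text into NA; without suppressWarnings() this
# raises the "NAs introduced by coercion" warning
x <- suppressWarnings(as.integer(x_raw))

# max() propagates the NA, and rep() rejects an NA 'times' argument
n <- max(x)
msg <- tryCatch(rep(NA, n), error = function(e) conditionMessage(e))

# Dropping the header element first gives clean integers
x_ok <- as.integer(x_raw[-1])
n_ok <- max(x_ok)
```

After running this, `msg` contains the error text from rep(), while `n_ok` is an ordinary integer that rep() would accept.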
>>>>>>>>>>
>>>>>>>>>> The release branch is no longer being built, so I cannot push
>>>>>>>>>> a fix that will end up being available. The easiest thing for
>>>>>>>>>> you to do is upgrade your R to 2.6.0 alpha and use the devel
>>>>>>>>>> version of matchprobes.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Jim
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>              
>>>>>>>>>>
>>>>>>>>>>> I don't have this problem if I use the old version (1.0.22).
>>>>>>>>>>> Does anyone know what causes this?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> best yan
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Bioconductor mailing list
>>>>>>>>>>> Bioconductor at stat.math.ethz.ch
>>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>>>>>> Search the archives:
>>>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>>>>>                 
>>>>>>>>>>
>>>>>>>>>>               
>>>>>>>>>
>>>>>> -- 
>>>>>> James W. MacDonald, M.S.
>>>>>> Biostatistician
>>>>>> Affymetrix and cDNA Microarray Core
>>>>>> University of Michigan Cancer Center
>>>>>> 1500 E. Medical Center Drive
>>>>>> 7410 CCGC
>>>>>> Ann Arbor MI 48109
>>>>>> 734-647-5623
>>>>>>
>>>>>>
>>>>>>
>>>>>>       
>>>>>
>>>>>     
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>   
>>>
>>>
>>>  
>>>
>>
>
>
>

