[BioC] Missing Autocorrelation in Ringo

Wed Jul 1 15:40:58 CEST 2009

Hello,

Please find my answers below your questions.

> I could now also find and use the extractProbeAnno function but get this:
> 
> > test<-extractProbeAnno(RG)
> Creating probeAnno mapping for chromosome Error in 
> split.default(probeindex, as.factor(hits[[chrNameColumn]])) :  
>  Group length is 0 but data length > 0 In addition: Warning message: 
> In extractProbeAnno(RG) :  Some reporters had no or an unrecognized 
> genome position in RG$genes$SystematicName.
> 
> You said that the function works with an RGList object if it
> encodes the probe position in a column called "SystematicName". This
> is the case but it looks like this:
> 
> > RG$genes$SystematicName[100]
> [1] "CGH_U00096_943375_943434_a"
> 
> Is this what you meant?

No, not really. In the Agilent ChIP-chip output that I have seen previously,
the column SystematicName contained entries that looked like this:
chr17:033719613-033719665
and this is the format that the parser was written for. So I am afraid you
have to either rewrite extractProbeAnno for your data or generate the "pos"
data.frame by hand and then call the function posToProbeAnno.
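
Building the "pos" data.frame by hand could look roughly like the sketch
below. It uses base R only and rests on several assumptions: that an
identifier such as "CGH_U00096_943375_943434_a" reads as
prefix_accession_start_end_suffix, that the accession can serve as the
chromosome name, and that posToProbeAnno expects the column names PROBE_ID,
CHROMOSOME, POSITION and LENGTH (please check ?posToProbeAnno for your Ringo
version). The second and third identifiers are made up for illustration.

```r
# Example identifiers; with real data this would be RG$genes$SystematicName.
ids <- c("CGH_U00096_943375_943434_a",  # taken from the question
         "CGH_U00096_1200_1259_b",      # made-up example
         "not_a_position")              # made-up control without a position

# Assumed layout: <prefix>_<accession>_<start>_<end>_<suffix>
pattern <- "^[^_]+_(.+)_([0-9]+)_([0-9]+)_[^_]+$"
m <- regmatches(ids, regexec(pattern, ids))

# Keep only identifiers that matched (full match plus three groups).
ok <- lengths(m) == 4
parts <- do.call(rbind, m[ok])

start <- as.integer(parts[, 3])
end   <- as.integer(parts[, 4])

# Column names are an assumption about what posToProbeAnno expects.
# Using the identifier itself as PROBE_ID is also an assumption; it has
# to match whatever probe identifiers you use with your intensity data.
pos <- data.frame(
  PROBE_ID   = ids[ok],
  CHROMOSOME = parts[, 2],
  POSITION   = start,
  LENGTH     = end - start + 1L,
  stringsAsFactors = FALSE
)

print(pos)
```

The resulting data.frame would then be handed to posToProbeAnno. Identifiers
that do not match the pattern (control spots, for example) are dropped here,
so it is worth checking how many rows are lost.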

> 
> I also tried the autocorrelation again as before but got the same
> strange result. Could it be that it is a problem that the data in 
> the RG file are not ordered? I had problems with that when I used 
> some other peak detection because the program looked just at the 
> next entry without considering the probe position. My DNA fragments 
> were 1000bp and smaller with a peak at 500bp.

Well, in that case you obviously should have some degree of auto-correlation
at least up to 500bp. I cannot tell what went wrong as I have not come across
this problem before. The fact that the probes are not sorted should not
matter, as the probe positions are taken from the probeAnno object and not
from the row ordering of the RGList. If you send me (off list!) the RGList and
the "pos" data.frame for the probeAnno, I can have a more thorough look into
this.

> About the judging of quality of found peaks, could you explain what 
> " resort to visualizations" means and how I would do this practically.

The vignette of the package contains examples of how to visualize
a.) your data in any genomic region that you specify
b.) the identified peaks
Please follow these examples and contact me if there are further questions.

Best regards,
Joern

---
Joern Toedling
Institut Curie -- U900
26 rue d'Ulm, 75005 Paris, FRANCE
Tel. +33 (0)156246926