[BioC] Stuck with Yeast Tiling Array

Thu Sep 20 12:29:17 CEST 2007

Thanks Joern,

I'm starting to understand now. I am a total beginner at R, so all
this is very new. How do I make an equivalent of
davidTiling$nucleicAcid? If I type that, I see this:

 > davidTiling$nucleicAcid
[1] genomic DNA genomic DNA genomic DNA poly(A) RNA poly(A) RNA poly(A) RNA
[7] total RNA   total RNA
Levels: genomic DNA poly(A) RNA total RNA

I need to create a similar thing for my (richard) data set. Any
ideas? My isDNA and isRNA are both empty at the moment, which is
why it isn't working!
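
A minimal sketch of why the selectors come out empty and how a
nucleicAcid column could be added. It uses a plain data frame `pd` as a
hypothetical stand-in for pData(richard), and the assignment of
"genomic DNA" and "total RNA" to the two CEL files is an assumption made
only for illustration:

```r
# Hypothetical stand-in for pData(richard): one row per CEL file, no columns yet.
pd <- data.frame(row.names = c("ucont.CEL", "utest.CEL"))

# While the column does not exist, pd$nucleicAcid is NULL, so the
# comparison returns a zero-length logical vector (an "empty" selector).
empty <- pd$nucleicAcid == "genomic DNA"
length(empty)

# Add a factor with one entry per sample, in the same order as the rows.
# Which sample gets which label is an assumption for this example.
pd$nucleicAcid <- factor(c("genomic DNA", "total RNA"))

# Now the comparisons give one TRUE/FALSE value per sample.
isDNA <- pd$nucleicAcid == "genomic DNA"
isRNA <- pd$nucleicAcid == "total RNA"
```

On the real ExpressionSet the same kind of assignment, richard$nucleicAcid
<- factor(...), should work, because Biobase lets `$` read and write
pData columns. Note that the values compared against must be the labels
stored in the column, not the file names.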

I have two DNA samples (and I'm just treating one as RNA) so that I can
work through the programs, see how it all works, and find out where I
need to modify things before I do a large batch of arrays. (I've been
changing the salt conditions, so I need to check that these arrays
have actually hybridized!)

All I actually want to do is an equivalent of rma, but I think the
tilingArray/davidTiling packages are the only resources available at
the moment... I think!?

Thanks for all your help,
Richard

On 19 Sep 2007, at 18:29, Joern Toedling wrote:

> Richard,
>
>> Thanks Joern,
>> The first part works great:
>>
>>> sampleNames(richard)
>> [1] "./ucont.CEL" "./utest.CEL"
>
> Not sure whether that preceding './' in front of every name is what
> you would want. How about
> sampleNames(richard) <- dir(pattern="\\.CEL$", full.names=FALSE)
> instead?
>> I was worried because when I use the davidTiling dataset and call
>> sampleNames, each CEL file is preceded by a number, so what I would
>> expect to see is:
>>
>> [1] "./ucont.CEL"
>> [2] "./utest.CEL"
> This is just how a character vector is displayed in your console.
> With the ExpressionSet in davidTiling, the single entries in the
> sampleNames character vector are so long that only one can be
> displayed per line. The number preceding each line simply indicates
> which element of the vector that line starts with.
> With your sampleNames, the single entries are relatively short and
> all of them fit on one line.
>
>>
>> What I actually want to do is normalise the arrays using the
>> normalizeByReference function. I now get the following error:
>>
>>
>>> isDNA = richard$nucleicAcid == "ucont"
>>> isRNA = richard$nucleicAcid == "utest"
>>> pm = PMindex(probeAnno)
>>> bg= BGindex(probeAnno)
>>> yn = normalizeByReference(richard[,isRNA], reference = richard[,isDNA],
>> +     pm = pm, background = bg)
>>
>> Error in normalizeByReference(richard[, isRNA], reference = richard[,  :
>>         There is nothing to normalize in 'x'.
>
> Have you checked the contents of isDNA and isRNA? Are "pm" and "bg"
> reasonable, and is "bg" a subset of "pm"? Please refer to the
> vignettes of davidTiling for more details on these. And does
> pData(richard) have a column "nucleicAcid"? Since you only have two
> samples, it may be easier to set
> isDNA <- 1; isRNA <- 2
> This is only useful, though, if the array with the CEL file
> 'ucont.CEL' really is a genomic DNA hybridization and 'utest.CEL' an
> RNA hybridization. If not, normalizing by a genomic DNA
> hybridization may not be appropriate.
>
> Best regards,
> Joern
>
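
The checks suggested above can be sketched in base R. This is only an
illustration under stated assumptions: the matrix `x` is a made-up
stand-in for the intensities in richard, and the values of `pm` and `bg`
are invented, not taken from any real probeAnno:

```r
# Hypothetical stand-in: a 5-probe x 2-array intensity matrix.
x  <- matrix(1:10, ncol = 2,
             dimnames = list(NULL, c("ucont.CEL", "utest.CEL")))
pm <- c(1L, 2L, 4L, 5L)   # made-up perfect-match probe indices
bg <- c(2L, 5L)           # made-up background probe indices

# Integer selectors, as suggested: column 1 is the reference, column 2 the sample.
isDNA <- 1
isRNA <- 2

ref <- x[, isDNA, drop = FALSE]
smp <- x[, isRNA, drop = FALSE]

# Sanity checks before normalizing:
stopifnot(ncol(ref) == 1, ncol(smp) == 1)  # each selector picks exactly one array
stopifnot(all(bg %in% pm))                 # background probes are a subset of pm

# An empty logical selector, by contrast, picks no arrays at all.
nothing <- x[, logical(0), drop = FALSE]
ncol(nothing)
```

The last two lines show the likely situation behind "There is nothing to
normalize in 'x'": a zero-length selector leaves an object with no
arrays in it.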