[BioC] snapCGH

Mon Nov 20 22:11:14 CET 2006

Hi,
As Sean has said, the methods available within snapCGH won't work if the 
Position and Chromosome elements aren't present in the $genes data frame.

However, I think the error you are currently seeing isn't related to 
that.  The processCGH function averages replicates of the clones, and 
the ID argument specifies which column in $genes contains an identifier 
for each clone.  If you don't have such an identifier then the easiest 
thing to do is add a column with the name "ID" to $genes containing the 
numbers from 1 to the number of rows in the $genes data frame.
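
A minimal sketch of that step, in case it helps.  It uses a plain list as a
stand-in for the MAList so that it runs without limma or snapCGH, and the
Entrez.GeneID values are made up for illustration:

```r
# Stand-in for an MAList: a plain list holding a $genes data frame.
# The Entrez.GeneID values are illustrative only, not from any real dataset.
MA <- list(genes = data.frame(Entrez.GeneID = c(1017, 7157, 672)))

# Add a running clone identifier, but only if no "ID" column exists yet,
# so an existing identifier is never overwritten.
if (!"ID" %in% colnames(MA$genes)) {
  MA$genes$ID <- seq_len(nrow(MA$genes))
}
```

With a real MAList the same assignment to MA$genes$ID should apply, after
which the original call with ID="ID" can be retried.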

Hopefully the processCGH function will then work.

Mike Smith

Quoting Sean Davis <sdavis2 at mail.nih.gov>:

> On Monday 20 November 2006 10:55, João Fadista wrote:
>>  Hi everyone,
>>
>> When I read the "snapCGH: Segmentation, Normalization and Processing of aCGH
>> Data User's Guide" I became really excited by all its features for
>> analysing CGH data, and because it is designed to be used in conjunction with
>> the limma package, which I have already been using. I have done the practicals
>> and browsed the main functions using the data given in the package.
>>
>> After this stage I wanted to deal with a real data set, so I downloaded a
>> CGH experiment from GEO (Gene Expression Omnibus) and loaded it into my R
>> workspace using the GEOquery package. I then converted the GEO DataSet into
>> an MAList so that I could use the data with the snapCGH package.
>>
>> Despite this, when I used the function processCGH it gave me an error:
>> > MA2 <- processCGH(MA, method.of.averaging=mean, ID="ID")
>>
>> Error in processCGH(MA, method.of.averaging = mean, ID = "ID") : $design
>> component is null
>>
>> So I then added the $design component, but that gave me a different error:
>> > MA$design <- rep(1,10)
>> >
>> > MA2 <- processCGH(MA, method.of.averaging=mean, ID="ID")
>>
>> Error in order(na.last, decreasing, ...) : argument 1 is not a vector
>>
>>
>>
>> Therefore, if MA is an object of class MAList this function should work. I
>> do not see what is wrong. Isn't the snapCGH package compatible with the
>> GEO datasets?
>>
>> There is also another thing. In the examples folder of the package, the
>> clones.info file has the columns Chromosome and Position, but in the
>> dataset from GEO there is only the Entrez.GeneID identifier. Do you know of
>> any way I could convert one into the other?
>
> Hi, Joao.
>
> All the CGH methods that are available via Bioconductor require a chromosome
> and basepair position.  They cannot work without these.  There are a number
> of ways to get chromosome location, but perhaps the simplest is to use the
> biomaRt package to go from gene_id to chromosome and position.  I don't think
> that you will be able to proceed without having the chromosome locations
> included in the MA$genes data frame and, although I am not sure, I would
> guess that the error is because of not having these.  Perhaps others on the
> list will confirm this.
>
> Sean
>
>