[BioC] Creating a new instance of oligoSnpSet

Steven McKinney smckinney at bccrc.ca
Wed Nov 26 20:57:08 CET 2008


Hi all,

Thanks to Robert Scharpf for a quick and detailed
off-line response.  For anyone else who may encounter
this issue:  my problem was that the data frame in my
featureData object's 'data' slot did not have columns
named "chromosome" and "position".

I originally defined my featureData object as

> cclfd <-
+   new("AnnotatedDataFrame",
+       data = data.frame(position = pData(featureData(ccld)[, "MapInfo"]),
+         chromosome = pData(featureData(ccld)[, "CHR"]),
+         stringsAsFactors = FALSE),
+       varMetadata = data.frame(labelDescription = c("position", "chromosome")))

extracting directly from my ccld object (a SnpSetIllumina object
created by the beadarraySNP function read.SnpSetIllumina()):

 ccld <- read.SnpSetIllumina(samplesheet = "ccl_CNV370SampleSheet_8samples.csv",
                             reportfile = "ccl_FinalReport_2.txt")


This yielded an AnnotatedDataFrame object with slot 'data'
containing a data frame whose names were not those I had
put in the data.frame() code above (namely "position"
and "chromosome").

> str(cclfd)
Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
  ..@ varMetadata      :'data.frame':	2 obs. of  1 variable:
  .. ..$ labelDescription: chr [1:2] "position" "chromosome"
  ..@ data             :'data.frame':	373397 obs. of  2 variables:
  .. ..$ MapInfo: num [1:373397] 1.64e+08 1.66e+08 1.66e+08 1.66e+08 1.67e+08 ...
  .. ..$ CHR    : Factor w/ 25 levels "1","10","11",..: 18 18 18 18 18 18 18 18 18 18 ...
  .. .. ..- attr(*, "names")= chr [1:373397] "cnvi0000001" "cnvi0000002" "cnvi0000003" "cnvi0000004" ...
  ..@ dimLabels        : chr [1:2] "rowNames" "columnNames"
  ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slots
  .. .. ..@ .Data:List of 1
  .. .. .. ..$ : int [1:3] 1 1 0

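So that's my R lesson for today: names specified in a
data.frame() call don't necessarily stick!  When an argument
is itself a one-column data frame (which is what pData(...)
returns here), its existing column name is kept and the name
given in the call is ignored.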
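
A minimal base-R sketch of this behaviour, using made-up values in place
of the real MapInfo and CHR columns (the names inner_pos and inner_chr
are illustrative only, not objects from my session):

```r
# Stand-ins for the one-column data frames returned by
# pData(featureData(ccld)[, "MapInfo"]) and pData(featureData(ccld)[, "CHR"]).
# The values are made up for illustration.
inner_pos <- data.frame(MapInfo = c(164e6, 166e6, 167e6))
inner_chr <- data.frame(CHR = factor(c("1", "1", "2")))

# The argument names "position" and "chromosome" are dropped: each argument
# is a one-column data frame, so its own column name is kept instead.
df <- data.frame(position = inner_pos, chromosome = inner_chr,
                 stringsAsFactors = FALSE)
names(df)            # "MapInfo" "CHR"

# stringsAsFactors = FALSE only governs conversion of character vectors;
# a column that is already a factor stays a factor.
is.factor(df$CHR)    # TRUE
```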


Explicitly forcing the column names, and converting the
chromosome column to mode "character" (stringsAsFactors = FALSE
does not convert a column that is already a factor),
solves the problem:

 ccld.position <- pData(featureData(ccld)[, "MapInfo"])
 names(ccld.position) <- "position"
 ccld.chromosome <- pData(featureData(ccld)[, "CHR"])
 names(ccld.chromosome) <- "chromosome"
 ccld.chromosome$chromosome <- as.character(ccld.chromosome$chromosome)

 cclfd <-
   new("AnnotatedDataFrame",
       data = data.frame(position = ccld.position,
         chromosome = ccld.chromosome,
         stringsAsFactors = FALSE),
       varMetadata = data.frame(labelDescription = c("position", "chromosome")))
 
and I can create the oligoSnpSet object successfully.

> cclss <-
+   new("oligoSnpSet", copyNumber = logR, calls = gt,
+       phenoData = annotatedDataFrameFrom(logR, byrow = FALSE),
+       featureData = cclfd, annotation = "HumanCNV370-Quad")
> str(cclss)
Formal class 'oligoSnpSet' [package "oligoClasses"] with 6 slots
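
A shorter route to the same data frame, offered as a sketch rather than
what was actually run: if the columns are pulled out as plain vectors
with [[ , the names given in the data.frame() call are used as-is.  The
data frame fd below is a made-up stand-in for pData(featureData(ccld)).

```r
# Made-up stand-in for pData(featureData(ccld)); only the column names
# MapInfo and CHR and the row-name style are taken from the str() output.
fd <- data.frame(MapInfo = c(164e6, 166e6, 167e6),
                 CHR = factor(c("1", "1", "2")),
                 row.names = c("cnvi0000001", "cnvi0000002", "cnvi0000003"))

# [[ returns a bare vector, so data.frame() uses the argument names.
# as.character() converts the factor explicitly.
fdat <- data.frame(position = fd[["MapInfo"]],
                   chromosome = as.character(fd[["CHR"]]),
                   row.names = rownames(fd),
                   stringsAsFactors = FALSE)
names(fdat)                     # "position" "chromosome"
is.character(fdat$chromosome)   # TRUE
```

With the real objects, fdat would then be passed as the data argument
of new("AnnotatedDataFrame", ...) as above.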


So it was the absence of columns named "chromosome" and "position"
in the 'data' slot of the featureData object that caused internal 
code to attempt to acquire chromosome positional information from 
an annotation source.

With the featureData@data data frame having the correct column
labels "chromosome" and "position", the annotation argument
is not processed further (it is just added to the SnpSet
object's 'annotation' slot).
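
Given that, it may be worth checking the column names before building
the object.  The helper below is hypothetical (it is not part of Biobase
or oligoClasses) and works on any plain data frame, such as the one
returned by pData(cclfd); the example data are made up.

```r
# Hypothetical helper, not part of any package: stop early if the feature
# data frame lacks the column names the constructor looks for.
check_feature_columns <- function(df, required = c("chromosome", "position")) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("featureData is missing column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up examples: the first mimics the failing case, the second the fix.
bad  <- data.frame(MapInfo = 1:3, CHR = c("1", "1", "2"))
good <- data.frame(position = 1:3, chromosome = c("1", "1", "2"),
                   stringsAsFactors = FALSE)

# Capture the error message instead of halting the script.
bad_result <- tryCatch(check_feature_columns(bad),
                       error = function(e) conditionMessage(e))
bad_result                     # error text naming the missing columns
check_feature_columns(good)    # passes, returning TRUE invisibly
```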

Thanks again to Robert Scharpf.

Best

Steve McKinney




-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch on behalf of Steven McKinney
Sent: Tue 11/25/2008 9:56 PM
To: Bioconductor at stat.math.ethz.ch
Subject: [BioC] Creating a new instance of oligoSnpSet
 
Hello All,

I am trying to get some Illumina HumanCNV370-Quad
data into VanillaICE to do some copy number analysis.

In attempting to create an object of class "oligoSnpSet"
I cannot seem to specify an annotation that works.

For example, as specified in a vignette:

> cclss <-
+   new("oligoSnpSet", copyNumber = logR, calls = gt,
+       phenoData = annotatedDataFrameFrom(logR, byrow = FALSE),
+       featureData = cclfd, annotation = "Illumina550k")
Loading required package: Illumina550k
Error in db(object) : Illumina550k package not available
In addition: Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called 'Illumina550k'
Error in dbGetQuery(db(object), sql) : 
  error in evaluating the argument 'conn' in selecting a method for function 'dbGetQuery'

or even if I specify some annotation that does exist

> cclss <-
+   new("oligoSnpSet", copyNumber = logR, calls = gt,
+       phenoData = annotatedDataFrameFrom(logR, byrow = FALSE),
+       featureData = cclfd, annotation = "hgu133plus2cdf")
Loading required package: hgu133plus2cdf
Error in db(object) : 
  trying to get slot "getdb" from an object of a basic class ("environment") with no slots
Error in dbGetQuery(db(object), sql) : 
  error in evaluating the argument 'conn' in selecting a method for function 'dbGetQuery'


Is there a way to work around this annotation bit of building
an eSet object? 

I can't figure out, from documentation, reading source code, or
experimenting, what will work for this annotation argument.

I'm a bit hooped as there does not yet appear to be annotation
for the Illumina HumanCNV370-Quad, but I have annotation
information from other files from Illumina etc.

Can I put some dummy object as an argument for annotation
and patch it up with my known info?

Any ideas?


Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor