[BioC] IRanges: cbind not well defined for RangedData?

Michael Dondrup Michael.Dondrup at uni.no
Thu Mar 18 15:55:07 CET 2010


Hi,
here is another possible little glitch with RangedData and cbind(). I would like to propose changing or
extending the behavior of the cbind function, or at least adding a note to its documentation. The use case is as
follows:
Assume we have some chromosomal ranges in a RangedData object. We can then iteratively compute statistics on
these ranges and attach them to the DataFrame holding the extra data, e.g. some count data, or combine quality scores, possibly from multiple conditions.

So according to the documentation of the RangedData-class,
> The first mode treats the object as a contiguous "data frame" annotated with range information.
> The accessors start, end, and width get the corresponding fields in the ranges as atomic integer vectors, undoing
> the division over the spaces. The [[ and matrix-style [, extraction and subsetting functions unroll the data in the same way. [[<- does the inverse.
I assumed I could use cbind(rd, a.value) to attach the statistics to the internal data representation. Would it be possible to
make cbind return something more useful, or is there a better way to do this?


Best
Michael


Example:

> a.value = rnorm(4)
> rd1 = RangedData(ranges=IRanges(start=runif(4, min=1, max=10E8), width=runif(4, min=1, max=10E5), names=paste("bla",1:4)), space=1:2)
> rd1
RangedData with 4 rows and 0 value columns across 2 spaces
            space                 ranges |
      <character>              <IRanges> |
bla 1           1 [773679042, 774010137] |
bla 3           1 [194819013, 195136171] |
bla 2           2 [183105318, 183509803] |
bla 4           2 [107730452, 107823748] |

>  obj = cbind(rd1, a.value)

I would intuitively expect the result to have the same structure as this (the ranges and values differ because they are drawn at random again):

> RangedData(ranges=IRanges(start=runif(4, min=1, max=10E8), width=runif(4, min=1, max=10E5), names=paste("bla",1:4)), space=1:2, a.value)
RangedData with 4 rows and 1 value column across 2 spaces
            space                 ranges |    a.value
      <character>              <IRanges> |  <numeric>
bla 1           1 [473042533, 473820859] | -1.7956588
bla 3           1 [ 75991383,  76022516] |  0.3588571
bla 2           2 [475385363, 476224756] |  1.4166218
bla 4           2 [532603052, 532902678] |  0.2324424

But what I get is quite different, a list matrix instead of a RangedData object:

> class(obj)
[1] "matrix"
> typeof(obj)
[1] "list"

> obj
     rd1 a.value   
[1,] ?   0.3255676 
[2,] ?   0.5913471 
[3,] ?   0.9317755 
[4,] ?   -0.8897527

> sessionInfo()
R version 2.10.1 (2009-12-14) 
x86_64-apple-darwin9.8.0 

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] IRanges_1.4.9

loaded via a namespace (and not attached):
[1] tools_2.10.1


