[BioC] IRanges: list columns in RangedData objects (was Re: IRanges: cbind not well defined for RangedData?)

Patrick Aboyoun paboyoun at fhcrc.org
Fri Mar 19 20:59:43 CET 2010


Michael,
Thanks for the report. RangedData objects have been designed to hold 
list objects in the values columns. You did, however, find a bug in the 
printing of a RangedData object when it contains a list column. I fixed 
the show method in both BioC 2.5 IRanges (>= 1.4.16) and BioC 2.6 
IRanges (>= 1.5.66) to handle this case.

 > rd <- RangedData(IRanges(start=1:4, width=10, names=paste("a",1:4)), space=1:2)
 > rd$a.value <- rnorm(4)
 > rd$a.list <- as.list(1:4)
 > rd
RangedData with 4 rows and 2 value columns across 2 spaces
          space    ranges |   a.value   a.list
    <character> <IRanges> | <numeric>   <list>
a 1           1   [1, 10] | 0.5362468 ########
a 3           1   [3, 12] | 0.5459593 ########
a 2           2   [2, 11] | 0.4705777 ########
a 4           2   [4, 13] | 0.4160833 ########

As you noticed, a list column in a RangedData object is expanded into 
multiple columns when you convert the object to a data.frame, which can 
produce a very large object if the RangedData object has many rows. 
Since the show method prints the class of each column, users can check 
that their data columns are stored as intended before converting to a 
data.frame.

 > as.data.frame(rd)
  space start end width names   a.value a.list.1L a.list.2L a.list.3L a.list.4L
1     1     1  10    10   a 1 0.5362468         1         2         3         4
2     1     3  12    10   a 3 0.5459593         1         2         3         4
3     2     2  11    10   a 2 0.4705777         1         2         3         4
4     2     4  13    10   a 4 0.4160833         1         2         3         4
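
For anyone who wants to check for list columns before converting, here is 
a minimal sketch in plain base R (no IRanges required). The `cols` list is 
a hypothetical stand-in for the value columns of a RangedData object, not 
part of the IRanges API, and my reading that the expansion above follows 
from how base R coerces a list to a data.frame is an assumption rather 
than something verified against the IRanges source.

```r
# Hypothetical stand-in for the value columns of a RangedData object.
cols <- list(a.value = rnorm(4), a.list = as.list(1:4))

# Flag columns that are stored as lists rather than atomic vectors.
is_list_col <- vapply(cols, is.list, logical(1))
print(is_list_col)

# Coercing a 4-element list on its own yields one row with four columns,
# which resembles the column expansion seen in the output above.
expanded <- as.data.frame(as.list(1:4))
print(dim(expanded))

# If every list element has length 1, unlist() recovers an atomic vector
# that converts to a single data.frame column.
if (all(lengths(cols$a.list) == 1L)) {
  cols$a.list <- unlist(cols$a.list)
}
flat <- as.data.frame(cols)
print(dim(flat))
```

Flattening with unlist() only makes sense when each list element holds a 
single value; a genuinely ragged list column would need a different 
representation before conversion.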



Patrick


On 3/19/10 7:23 AM, Michael Dondrup wrote:
> Dear Patrick and Michael,
>
> thank you very much for your helpful support on my last two connected issues! It is
> actually covered in the examples in the documentation, but I must have overlooked it.
>
> I tried it out immediately, and it works fine:
>
>    
>> rd = RangedData(IRanges(start=1:4, width=10, names=paste("a",1:4)), space=1:2 )
>> rd
>> rd$a.value = rnorm(4)
>> rd
>>      
> RangedData with 4 rows and 1 value column across 2 spaces
>          space    ranges |    a.value
>    <character>  <IRanges>  |<numeric>
> 1           1   [1, 10] | -0.6765515
> 2           1   [3, 12] |  1.5406962
> 3           2   [2, 11] | -1.2599696
> 4           2   [4, 13] |  0.4971178
>
> But then I had to reboot my computer because I accidentally tried this on 100,000 ranges
> and the value was actually a list, not a vector, and then the recycling rule struck me:
>
>    
>> rd$a.list = as.list(1:4)
>>      
> first everything seems fine and normal but if you try to print it:
>    
>> rd
>>      
> RangedData with 4 rows and 1 value column across 2 spaces
> Error in .Method(..., deparse.level = deparse.level) :
>    number of rows of matrices must match (see arg 2)
> or try to convert into a data.frame:
>    
>> as.data.frame(rd)
>>      
>    space start end width names a.list.1L a.list.2L a.list.3L a.list.4L
> 1     1     1  10    10   a 1         1         2         3         4
> 2     1     3  12    10   a 3         1         2         3         4
> 3     2     2  11    10   a 2         1         2         3         4
> 4     2     4  13    10   a 4         1         2         3         4
>
> When I tried this, R ran into memory problems.
>
> This is just a warning to make sure you really use a vector here. Maybe something to put in the
> type checking, or documentation?
>
> Anyway, thanks a lot again
> Michael
>
>


