[BioC] Merging RangedData object with names on the IRanges part

Thu Aug 20 19:13:22 CEST 2009

Ulrike,
First of all, I'm glad IRanges is useful for you.

Second, thanks for finding a bug in the rbind method for RangedData 
objects. Because of a developer oversight, duplicate names in the 
ranges were being handled differently from duplicate rownames in the 
values. This has been corrected in a recent svn check-in to IRanges in 
the BioC 2.5 code line. You can get the updated IRanges package 
(version 1.3.58) either through svn access now or, in 24-48 hours, 
from bioconductor.org via biocLite.
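
To give a feel for how this kind of mismatch can arise, here is a rough 
base-R analogy. It is only a sketch and not the actual IRanges code: the 
variable names are made up for illustration, and a plain character vector 
and a data.frame stand in for the ranges and the values.

```r
# Base-R analogy only; this is NOT the IRanges implementation.
# A character vector stands in for the names of the ranges,
# and a data.frame stands in for the values.
range_names <- c("a", "b", "c")
values <- data.frame(mapq = c(1, 2, 1), flag = c(3, 4, 5),
                     row.names = range_names)

# Concatenating the names keeps the duplicates as they are.
combined_names <- c(range_names, range_names)

# rbind on data frames makes the duplicated row names unique.
combined_values <- rbind(values, values)

# The two sets of names no longer agree, which is the kind of
# disagreement a "names must equal the rownames" check rejects.
names_agree <- identical(combined_names, rownames(combined_values))
print(names_agree)
```

When the two parts of an object treat duplicate names differently like 
this, a validity check that compares them will fail after rbind.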

 > suppressMessages(library(IRanges))

 > t1 <- RangedData(IRanges(start=c(7828367, 7828552,4121953), 
end=c(7828402, 7828587, 4121988)), space=c("Chr1", "Chr1", "Chr3"), 
mapq=c(1,2,1),flag=c(3,4,5))
 > rbind(t1, t1)
RangedData: 6 ranges by 2 columns on 2 sequences
colnames(2): mapq flag
names(2): Chr1 Chr3

 > t2 <- RangedData(IRanges(start=c(7828367, 7828552,4121953), 
end=c(7828402, 7828587, 4121988), names=c("a", "b", "c")), 
space=c("Chr1", "Chr1", "Chr3"), mapq=c(1,2,1),flag=c(3,4,5))
 > rbind(t2, t2)
RangedData: 6 ranges by 2 columns on 2 sequences
colnames(2): mapq flag
names(2): Chr1 Chr3

 > sessionInfo()
R version 2.10.0 Under development (unstable) (2009-08-05 r49073)
i386-apple-darwin9.7.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

other attached packages:
[1] IRanges_1.3.58

Patrick

Ulrike Goebel wrote:
> Dear list,
>
> I would like to do the following:
> Read an output file of BWA (SAM format) in "chunks" and incrementally 
> build a RangedData object from the chunks (by 'rbind'). Ultimately 
> that should be used to get the number of reads per annotated 
> transcript/region, but this is not the question here.
>
> Assume as an example:
> t1 <- RangedData(IRanges(start=c(7828367, 7828552,4121953), 
> end=c(7828402, 7828587, 4121988)), space=c("Chr1", "Chr1", "Chr3"), 
> mapq=c(1,2,1),flag=c(3,4,5))
>
> I can merge two copies of this by 'rbind(t1,t1)'.
>
> But:
> t2 <- RangedData(IRanges(start=c(7828367, 7828552,4121953), 
> end=c(7828402, 7828587, 4121988), names=c("a", "b", "c")), 
> space=c("Chr1", "Chr1", "Chr3"), mapq=c(1,2,1),flag=c(3,4,5))
> (Here, I would like to keep the read names along with their positions 
> in the IRanges object).
>
> > rbind(t2,t2)
> Error in validObject(.Object) :
>  invalid class "RangedData" object: the names of the ranges must equal 
> the rownames
>
> Am I doing something completely wrong here? Or is it confusing two 
> different meanings of 'names'?
>
>
> BTW, I really like IRanges !
>
> Ulrike
> > sessionInfo()
> R version 2.10.0 Under development (unstable) (2009-08-01 r49053)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] grid      stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] ChIPR_1.1.3          MASS_7.3-0           spatstat_1.16-1
> [4] deldir_0.0-8         gpclib_1.4-4         mgcv_1.5-5
> [7] convert_1.21.1       marray_1.23.0        matchprobes_1.17.0
> [10] AnnotationDbi_1.7.11 Biostrings_2.13.29   TeachingDemos_2.4
> [13] Ringo_1.9.8          Matrix_0.999375-30   lattice_0.17-25
> [16] limma_2.19.2         RColorBrewer_1.0-2   Biobase_2.5.5
> [19] IRanges_1.3.56
>
> loaded via a namespace (and not attached):
> [1] affy_1.23.4          affyio_1.13.3        annotate_1.23.1
> [4] DBI_0.2-4            genefilter_1.25.7    nlme_3.1-92
> [7] preprocessCore_1.7.4 RSQLite_0.7-1        splines_2.10.0
> [10] survival_2.35-4      tools_2.10.0         xtable_1.5-5
>