[BioC] Merging RangedData objects with names on the IRanges part

Fri Aug 21 09:02:03 CEST 2009

Thanks a lot for all your suggestions and help! I will have a look at
Rsamtools (I was already wondering whether something like this existed),
and yes, maybe it really is not necessary to read the file in chunks.

Thanks again!

Ulrike

Ulrike Goebel wrote:
> Dear list,
>
> I would like to do the following:
> Read an output file of BWA (SAM format) in "chunks" and incrementally
> build a RangedData object from the chunks (with 'rbind'). Ultimately
> this should be used to get the number of reads per annotated
> transcript/region, but that is not the question here.
>
> Assume as an example:
> t1 <- RangedData(IRanges(start=c(7828367, 7828552,4121953), 
> end=c(7828402, 7828587, 4121988)), space=c("Chr1", "Chr1", "Chr3"), 
> mapq=c(1,2,1),flag=c(3,4,5))
>
> I can merge two copies of this by 'rbind(t1,t1)'.
>
> But:
> t2 <- RangedData(IRanges(start=c(7828367, 7828552,4121953), 
> end=c(7828402, 7828587, 4121988), names=c("a", "b", "c")), 
> space=c("Chr1", "Chr1", "Chr3"), mapq=c(1,2,1),flag=c(3,4,5))
> (Here, I would like to keep the read names along with their positions 
> in the IRanges object).
>
> > rbind(t2,t2)
> Error in validObject(.Object) :
> invalid class "RangedData" object: the names of the ranges must equal 
> the rownames
>
> Am I doing something completely wrong here? Or is this confusing two
> different meanings of 'names'?
>
>
> BTW, I really like IRanges!
>
> Ulrike
> > sessionInfo()
> R version 2.10.0 Under development (unstable) (2009-08-01 r49053)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] grid stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] ChIPR_1.1.3 MASS_7.3-0 spatstat_1.16-1
> [4] deldir_0.0-8 gpclib_1.4-4 mgcv_1.5-5
> [7] convert_1.21.1 marray_1.23.0 matchprobes_1.17.0
> [10] AnnotationDbi_1.7.11 Biostrings_2.13.29 TeachingDemos_2.4
> [13] Ringo_1.9.8 Matrix_0.999375-30 lattice_0.17-25
> [16] limma_2.19.2 RColorBrewer_1.0-2 Biobase_2.5.5
> [19] IRanges_1.3.56
>
> loaded via a namespace (and not attached):
> [1] affy_1.23.4 affyio_1.13.3 annotate_1.23.1
> [4] DBI_0.2-4 genefilter_1.25.7 nlme_3.1-92
> [7] preprocessCore_1.7.4 RSQLite_0.7-1 splines_2.10.0
> [10] survival_2.35-4 tools_2.10.0 xtable_1.5-5
>

-- 
  Dr. Ulrike Goebel
  Bioinformatics Support
  Max-Planck Institute for Plant Breeding Research
  Carl-von-Linne Weg 10
  50829 Cologne
  Germany
  +49(0) 221 5062 121