[BioC] saving GRanges objects - resulting file size issue

Valerie Obenchain vobencha at fhcrc.org
Mon Aug 19 15:40:27 CEST 2013


Hi Andrew,

I can't reproduce this problem. Can you successfully save the files in a 
different order, i.e., is it just the addition of 'introns' that is 
giving you trouble?

I assume these are custom annotation GRanges that did not come from a 
TxDb in Bioconductor. Can you make the files available for testing?
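
In case it helps narrow this down, here is a minimal sketch of one way to check each object separately: save it to its own temporary file and compare the in-memory size with the size on disk. It uses base R only. `saved_size_mb` is a hypothetical helper name (not part of GenomicRanges), and the two toy data frames are stand-ins for your objects.

```r
# Hypothetical helper: save each named object to its own temporary .rda file
# and report its in-memory size next to its on-disk size (both in Mb).
saved_size_mb <- function(obj_names, envir = parent.frame()) {
  force(envir)  # resolve the caller's environment before entering vapply
  res <- vapply(obj_names, function(nm) {
    f <- tempfile(fileext = ".rda")
    on.exit(unlink(f), add = TRUE)  # remove the temporary file afterwards
    save(list = nm, file = f, envir = envir)
    c(memory_mb = as.numeric(object.size(get(nm, envir = envir))) / 1024^2,
      file_mb   = file.size(f) / 1024^2)
  }, numeric(2))
  t(res)  # one row per object, columns memory_mb and file_mb
}

# Toy stand-ins (assumption); with the real data the call would be
# saved_size_mb(c("exons", "utr5seg", "utr3seg", "introns"))
small <- data.frame(x = 1:10)
large <- data.frame(x = rnorm(1e5))
sizes <- saved_size_mb(c("small", "large"))
print(sizes)
```

If only the row for 'introns' shows a file far larger than its in-memory size, that would point at that one object rather than at the combination or the order.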

Valerie

On 08/16/2013 10:13 AM, Andrew Jaffe wrote:
> Hey,
>
> I'm trying to save several GRanges objects in the same rda file, but for
> some reason, one of the smaller GRanges objects (~23Mb) makes the saved
> file go from 25Mb to 9Gb+ and I'm unsure of why exactly, and have never
> seen this problem before:
>
>> class(exons)
> [1] "GRanges"
> attr(,"package")
> [1] "GenomicRanges"
>> class(utr5seg)
> [1] "GRanges"
> attr(,"package")
> [1] "GenomicRanges"
>> class(utr3seg)
> [1] "GRanges"
> attr(,"package")
> [1] "GenomicRanges"
>> class(introns)
> [1] "GRanges"
> attr(,"package")
> [1] "GenomicRanges"
>> save(exons, utr5seg, utr3seg, trans, tInfo, t2g, # introns,
>   file="annotation-tables-forRNAseq.rda")
>
> The resulting saved file in my directory is 23 Mb.
>
>> print(object.size(introns),units="Mb")
> 22.9 Mb
>
> But when I include this `introns` GRanges object by running:
>
> save(exons, utr5seg, utr3seg, trans, tInfo, t2g, introns,
> file="annotation-tables-forRNAseq.rda")
>
> the file size in my directory got up to 9 Gb, and I had to kill the save.
> Any idea what's going on?
>
>> introns
> GRanges with 691807 ranges and 2 metadata columns:
>             seqnames               ranges strand   |       tx_id
>                <Rle>            <IRanges>  <Rle>   | <character>
>         [1]     chr1     [ 12228,  12645]      +   |       1,2,3
>         [2]     chr1     [ 12698,  13402]      +   |       1,2,3
>         [3]     chr1     [322229, 324287]      +   |           7
>         [4]     chr1     [324346, 324438]      +   |           7
>         [5]     chr1     [324061, 324287]      +   |           8
>         ...      ...                  ...    ... ...         ...
>    [691803]     chrY [27216989, 27218792]      -   |       76941
>    [691804]     chrY [27218869, 27245878]      -   |       76941
>    [691805]     chrY [27329896, 27330860]      -   |       76942
>    [691806]     chrY [59359509, 59360006]      -   |       76950
>    [691807]     chrY [59360116, 59360500]      -   |       76950
>                                      tx_name
>                                  <character>
>         [1] uc001aaa.3,uc010nxq.1,uc010nxr.1
>         [2] uc001aaa.3,uc010nxq.1,uc010nxr.1
>         [3]                       uc009vjk.2
>         [4]                       uc009vjk.2
>         [5]                       uc001aau.3
>         ...                              ...
>    [691803]                       uc011nbv.2
>    [691804]                       uc011nbv.2
>    [691805]                       uc004fwt.3
>    [691806]                       uc011ncc.1
>    [691807]                       uc011ncc.1
>    ---
>    seqlengths:
>          chr1      chr2      chr3      chr4 ...     chr22      chrX      chrY
>     249250621 243199373 198022430 191154276 ...  51304566 155270560  59373566
>
>
> Thanks a ton,
> Andrew


