[BioC] Using GenomicRanges with table data

Tom Oates toates19 at gmail.com
Fri Jan 11 12:54:25 CET 2013


Hi
I am trying to use GenomicRanges as part of an analysis of sequencing data.
I have a number of files which I wish to use to make GRanges objects.
For example:

chr1    579578    579804    CpG_12
chr1    630418    630623    CpG_11
chr1    804552    804763    CpG_9
chr1    1307051    1307362    CpG_16
chr1    1323599    1323808    CpG_9
chr1    1350549    1350758    CpG_12
chr1    1403287    1403637    CpG_20
chr1    1418906    1419488    CpG_28

This file is sorted such that chr1 is followed by chr2, 3, 4, 5 etc. up to 20
(as opposed to chr10, 11...19, 2, 3 etc.).

I use the following to make the GRanges object:
cpgi_gr <- GRanges(seqnames = Rle(cpgi$V1),
                   ranges = IRanges(start = cpgi$V2, end = cpgi$V3),
                   UCSC_AL_ID = cpgi$V4)

but then if I examine

seqnames(cpgi_gr)


I get
factor-Rle of length 89611 with 21 runs
  Lengths:  8952  5602  6133  4973  5840  4260 ...  3132  3607  2793  3175  3842  1419
  Values :  chr1  chr2  chr3  chr4  chr5  chr6 ... chr16 chr17 chr18 chr19 chr20  chrX
Levels(21): chr1 chr10 chr11 chr12 chr13 chr14 chr15 ... chr5 chr6 chr7 chr8 chr9 chrX
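
The same pattern can be reproduced in base R without GenomicRanges. This is a minimal sketch, assuming the first column ended up as a factor (for example because read.table converted strings to factors), in which case the levels default to alphabetical order while the values keep the file order. The vector `chr` is made-up illustration data, not taken from the real file:

```r
# Made-up chromosome labels in the file's "natural" order (illustration only)
chr <- c("chr1", "chr1", "chr2", "chr3", "chr10", "chr11", "chrX")

# Default factor(): values keep the file order, levels are sorted as strings,
# so "chr10" and "chr11" come before "chr2"
f_default <- factor(chr)
levels(f_default)

# Explicit levels in order of first appearance, i.e. the file's own order
f_natural <- factor(chr, levels = unique(chr))
levels(f_natural)
```

If that assumption holds, the values and levels describe the same chromosomes and only the ordering of the levels differs.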


So the order of the Values and the Levels does not match.  I hope to give
the GRanges object seqlengths (the chr lengths in the genome) so that I can
then perform flank etc. tasks on the data, so it is crucial that the values &
lengths match.  I imagine that this problem stems from my not
understanding either IRanges or Rle sufficiently, but I have read the help
on Rle objects & IRanges and can't work out how to ensure that
constructing the GRanges object leads to the chr values matching the levels.
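
For illustration, here is a minimal sketch of one possible way to get there, assuming the ordering comes from the factor levels of the first column. The data frame below is a made-up stand-in for the table read from file, and the lengths in `chr_len` are hypothetical placeholders rather than real chromosome lengths:

```r
library(GenomicRanges)  # also attaches IRanges

# Made-up stand-in for the table read from file (same column names as above)
cpgi <- data.frame(
  V1 = c("chr1", "chr1", "chr2", "chr10", "chrX"),
  V2 = c(579578L, 630418L, 804552L, 1307051L, 1323599L),
  V3 = c(579804L, 630623L, 804763L, 1307362L, 1323808L),
  V4 = c("CpG_12", "CpG_11", "CpG_9", "CpG_16", "CpG_9"),
  stringsAsFactors = FALSE
)

# Chromosome names in order of first appearance, i.e. the file's own order
chr_order <- unique(as.character(cpgi$V1))

# Fix the factor levels explicitly before building the GRanges object
cpgi_gr <- GRanges(
  seqnames = Rle(factor(cpgi$V1, levels = chr_order)),
  ranges = IRanges(start = cpgi$V2, end = cpgi$V3),
  UCSC_AL_ID = cpgi$V4
)
seqlevels(cpgi_gr)

# Hypothetical lengths: a named vector, looked up by name so that the
# assignment does not depend on the order of the levels
chr_len <- c(chr1 = 2000000L, chr2 = 2000000L, chr10 = 2000000L, chrX = 2000000L)
seqlengths(cpgi_gr) <- chr_len[seqlevels(cpgi_gr)]

# Flanking regions can then be computed on the object
flank(cpgi_gr, width = 100)
```

Because the seqlengths are assigned by name, the lengths should end up attached to the right chromosomes even when the levels are left in alphabetical order.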

Thanks


