[BioC] biomaRt: retrieve total chromosome lengths

Sean Davis sdavis2 at mail.nih.gov
Tue Oct 31 12:39:16 CET 2006


On Tuesday 31 October 2006 03:15, De Bondt, An-7114 [PRDBE] wrote:
> Hi Steffen,
> Hi Jim,
>
> Thanks for your suggestions!
> To avoid hard coding, I'll retrieve indeed the end position of the last
> transcript on each of the chromosomes.  This is, relatively seen, pretty
> close to the real length of the chromosome.

Another simple solution is to use information from UCSC (which uses the same 
chromosome assemblies as Ensembl, at least for human and mouse, and probably 
for many other species).  As an example, for the human genome build from March 
2006 (called hg18 by UCSC), one can simply download this file and read it 
into R:

http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/chromInfo.txt.gz

which is a tab-delimited file whose first column holds the chromosome name 
('chr1', 'chr2', etc.) and whose second column holds the total base count for 
the chromosome.  
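
A minimal sketch of reading that file into R, assuming only base R is 
available.  The helper names read_chrom_lengths and fetch_chrom_lengths are 
made up for illustration, and the code relies only on the two columns 
described above (any further columns are ignored):

```r
# Hypothetical helper (illustration only): parse a chromInfo-style
# tab-delimited file into a named vector of chromosome lengths.
# `path` may point to a plain or gzip-compressed file; read.delim()
# decompresses gzip files transparently when reading from a path.
read_chrom_lengths <- function(path) {
  info <- read.delim(path, header = FALSE, stringsAsFactors = FALSE)
  lens <- as.numeric(info[[2]])   # column 2: total base count
  names(lens) <- info[[1]]        # column 1: chromosome name
  lens
}

# Hypothetical wrapper: download the file to a temporary location first,
# then parse it.  Requires network access.
fetch_chrom_lengths <- function(url) {
  tmp <- tempfile(fileext = ".txt.gz")
  on.exit(unlink(tmp))
  download.file(url, tmp, mode = "wb", quiet = TRUE)
  read_chrom_lengths(tmp)
}

# Usage (not run here, since it needs network access):
# hg18 <- fetch_chrom_lengths(
#   "http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/chromInfo.txt.gz")
# hg18["chr1"]
```

Downloading to a temporary file first avoids depending on how a given R 
version handles compressed data read directly from a URL connection.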

Sean


