[BioC] biomaRt: retrieve total chromosome lengths

Robert Gentleman rgentlem at fhcrc.org
Tue Oct 31 16:46:12 CET 2006


They are, of course, included in all Bioc chip annotation packages;
for example,

 > hgu95av2CHRLENGTHS
        1         2         3         4         5         6         7         8
246127941 243615958 199344050 191731959 181034922 170914576 158545518 146308819
        9        10        11        12        13        14        15        16
136372045 135037215 134482954 132078379 113042980 105311216 100256656  90041932
       17        18        19        20        21        22         X         Y
 81860266  76115139  63811651  63741868  46976097  49396972 153692391  50286555
        M
    16571

So while one can get them from the web, there are alternatives.


Sean Davis wrote:
> On Tuesday 31 October 2006 03:15, De Bondt, An-7114 [PRDBE] wrote:
>> Hi Steffen,
>> Hi Jim,
>>
>> Thanks for your suggestions!
>> To avoid hard coding, I'll retrieve indeed the end position of the last
>> transcript on each of the chromosomes.  This is, relatively seen, pretty
>> close to the real length of the chromosome.
> 
> Another simple solution is to use information from UCSC (who use the same 
> chromosomes for building as ensembl, at least for human and mouse, and 
> probably many others).  As an example, for the human genome build from March 
> 2006 (called hg18 by UCSC), one can simply download and read this file using 
> R:
> 
> http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/chromInfo.txt.gz
> 
> which is a tab-delimited file whose first column is the chromosome name 
> ('chr1', 'chr2', etc.) and whose second column is the total base count for 
> the chromosome.  
> 
> Sean
> 
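
For the UCSC route Sean describes above, a minimal R sketch might look like
the following. The helper name read_chrom_lengths is hypothetical (it is not
part of any package), and the code assumes only what the post states: a
tab-delimited file with the chromosome name in column 1 and the base count in
column 2, with no header row.

```r
# Sketch: read a UCSC chromInfo-style file into a named vector of lengths.
# Assumptions: read_chrom_lengths is a hypothetical helper name; the file has
# no header row; only columns 1 and 2 are used, any further columns are ignored.
read_chrom_lengths <- function(path) {
  # read.delim() reads tab-delimited text; R's file connections normally
  # detect gzip-compressed input, so a downloaded .gz can be passed directly.
  info <- read.delim(path, header = FALSE, stringsAsFactors = FALSE)
  len <- as.numeric(info[[2]])   # column 2: total base count
  names(len) <- info[[1]]        # column 1: chromosome name ('chr1', ...)
  len
}

# Usage (needs network access, so left commented out):
# url <- "http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/chromInfo.txt.gz"
# tmp <- tempfile(fileext = ".txt.gz")
# download.file(url, tmp, mode = "wb")
# chrom_lengths <- read_chrom_lengths(tmp)
# chrom_lengths["chr1"]
```

Note that the names returned this way carry the UCSC 'chr' prefix, unlike the
names in the hgu95av2CHRLENGTHS vector, so something like
sub("^chr", "", names(chrom_lengths)) would be needed before comparing the two.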

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


