[BioC] how to get chrom positions(bp) of a fragment delimited by 2 cytobands

Sean Davis sdavis2 at mail.nih.gov
Thu Oct 30 18:46:40 CET 2008


On Thu, Oct 30, 2008 at 1:26 PM, Al Tango <time.is.flying at gmail.com> wrote:
> Dear All,
>
> I want to find the chromosome start and end positions in bp for a fragment
> delineated by two cytobands (e.g. 5q13.1-5q13.2), on a large scale. I tried
> getBM() in package 'biomaRt', but didn't find the right attribute to use. If
> I use "chromosome_location", I get many locations instead of just the
> 'start' and 'end' positions (see below). Should I extract the first and last
> positions from the resulting table, or is there another way to do this with
> biomaRt, or even without biomaRt?
>
> Many thanks!
>
>> ensembl = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
>>
> res=getBM(attributes=c("chromosome_name","chromosome_location"),filters=c("chromosome_name","band_start",
> "band_end"),values=list("5","q13.1", "q13.2"), mart=ensembl)
>
>> dim(res)
> [1] 12208     2
>> res[1:10,]
>   chromosome_name chromosome_location
> 1                5                  NA
> 2                5            72892696
> 3                5            72893102
> 4                5            72893367
> 5                5            72893406
> 6                5            72893742
> 7                5            72893755
> 8                5            72893834
> 9                5            72893947
> 10               5            72894391

Biomart is transcript/gene centric, so I think that query returns the
locations of genes lying between those two bands rather than the
boundaries of the bands themselves.  Try the UCSC cytoband table instead:

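# Download the hg18 cytoband table from UCSC to a temporary file
t <- tempfile()
download.file('http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/cytoBand.txt.gz',t)
# Tab-separated, no header: chromosome, start, end, band name, stain
cytobands <- read.table(gzfile(t),sep="\t")
cytobands[1:5,]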

    V1      V2       V3     V4     V5
1 chr1       0  2300000 p36.33   gneg
2 chr1 2300000  5300000 p36.32 gpos25
3 chr1 5300000  7100000 p36.31   gneg
4 chr1 7100000  9200000 p36.23 gpos25
5 chr1 9200000 12600000 p36.22   gneg
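
From a table like this, the span between two bands is the smallest start
and the largest end among the matching rows.  A minimal sketch, assuming
the columns are V1 = chromosome, V2 = start, V3 = end, V4 = band name as
shown above; band_span is a hypothetical helper written for illustration,
not a function from any package, and the example data frame just repeats
the five rows printed above:

```r
# Hypothetical helper (not from any package): bp span covered by two bands
# on one chromosome. Assumes V1 = chromosome, V2 = start, V3 = end,
# V4 = band name, as in the read.table() output above.
band_span <- function(cytobands, chrom, band_from, band_to) {
  wanted <- unique(c(band_from, band_to))
  on_chrom <- cytobands[cytobands$V1 == chrom, ]
  hits <- on_chrom[on_chrom$V4 %in% wanted, ]
  # Fail loudly if either band name is missing for this chromosome
  if (length(unique(hits$V4)) < length(wanted))
    stop("band not found on ", chrom)
  # Take min/max so the order of the two bands does not matter
  c(start = min(hits$V2), end = max(hits$V3))
}

# The five chr1 rows printed above, rebuilt by hand for a quick check
example <- data.frame(
  V1 = "chr1",
  V2 = c(0, 2300000, 5300000, 7100000, 9200000),
  V3 = c(2300000, 5300000, 7100000, 9200000, 12600000),
  V4 = c("p36.33", "p36.32", "p36.31", "p36.23", "p36.22"),
  V5 = c("gneg", "gpos25", "gneg", "gpos25", "gneg")
)
span <- band_span(example, "chr1", "p36.32", "p36.23")
span  # start = 2300000, end = 9200000
```

With the full downloaded table, the original question would be something
like band_span(cytobands, "chr5", "q13.1", "q13.2").  Band names have to
match the V4 column exactly, and UCSC database tables use 0-based start
coordinates, so add 1 to the start if you need 1-based positions.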

Sean


>> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] biomaRt_1.14.0 RCurl_0.9-3
>
> loaded via a namespace (and not attached):
> [1] XML_1.95-3
>>