[BioC] Plotting Broad HMM states in Gviz

Elmar Tobi [guest] guest at bioconductor.org
Mon Jan 14 09:16:40 CET 2013


Hi all,

Recently I had problems plotting the Broad HMM tracks in Gviz so that they resemble the 'dense' display in the UCSC browser. The package maintainer posted an excellent script that makes the plot identical to the UCSC browser view. Many thanks to Florian!

Hi Elmar,
this is indeed a little tricky because the data in UCSC are spread across several tables. I think you will have to fetch the data from each of these tables individually. Looking at the table structure, it seems that the data are stored in a BED-like structure, and the color is encoded in the itemRgb column. In your first attempt you didn't tell the UcscTrack constructor how to map the columns in the downloaded tables to the AnnotationTrack fields. You need to provide at least the mapping for the start and end coordinates (the 'chromStart' and 'chromEnd' columns in the UCSC table) and, to make use of the color information, also the 'itemRgb' column. The item names may also be useful, and we can get those from the 'name' column. Below is what such a call would look like:

library(Gviz)

Broad1 <- UcscTrack(track='Broad ChromHMM', table="wgEncodeBroadHmmGm12878HMM",
                    trackType="AnnotationTrack", genome='hg18', chromosome='chr18',
                    name='12878', from=44675486, to=44679944,
                    start="chromStart", end="chromEnd",  # coordinate columns
                    feature="itemRgb", id="name",         # color and item name columns
                    collapse=FALSE, stacking="dense")

Note that I also set the stacking to 'dense' and turned off collapsing, because we want to force all items onto one line. Now we just need to tell the track how to color the items accordingly. That may look a bit exotic, but in the end it is rather simple: I define a bunch of display parameters with the same names as the features we downloaded from UCSC in the itemRgb column. The values of these parameters are the colors that have been stored as RGB values.

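# the distinct "R,G,B" strings taken from the itemRgb column
feat <- unique(feature(Broad1))
# convert each string to a hex color and name it after the feature it belongs to
featCol <- setNames(as.list(rgb(t(sapply(strsplit(feat, ","), as.numeric)), maxColorValue=255)), feat)
# register one display parameter per feature
displayPars(Broad1) <- featCol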
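
To see what the conversion step does in isolation, here is a minimal sketch in base R that needs neither Gviz nor a UCSC connection. The two "R,G,B" strings are made up for illustration and are not taken from the actual table; the real values come from feature(Broad1).

```r
# Hypothetical itemRgb values, for illustration only
feat <- c("255,0,0", "0,176,80")

# Split each string at the commas and convert the parts to numbers;
# sapply() returns a 3 x n matrix (one column per feature), so we
# transpose it to get one row per feature with columns R, G, B.
rgbMat <- t(sapply(strsplit(feat, ","), as.numeric))

# rgb() accepts a three-column matrix and returns hex color strings
hexCol <- rgb(rgbMat, maxColorValue=255)

# Named list: the names are the feature strings, the values the colors
featCol <- setNames(as.list(hexCol), feat)
str(featCol)
```

Because the list names match the feature values of the track, assigning such a list through displayPars() lets each item be drawn in the color stored for it.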

Now you would have to do the same for all your tables, so maybe sticking the whole thing into lapply will make sense:


tables <- c("wgEncodeBroadHmmGm12878HMM", "wgEncodeBroadHmmH1hescHMM",
            "wgEncodeBroadHmmK562HMM", "wgEncodeBroadHmmHepg2HMM",
            "wgEncodeBroadHmmHuvecHMM", "wgEncodeBroadHmmHmecHMM")

tracks <- lapply(tables, function(table){
	# fetch one table; the track name is the table name without prefix and suffix
	track <- UcscTrack(track='Broad ChromHMM', table=table, trackType="AnnotationTrack",
		genome='hg18', chromosome='chr18',
		name=gsub("^wgEncodeBroadHmm|HMM$", "", table),
		from=44675486, to=44679944, start="chromStart", end="chromEnd",
		feature="itemRgb", id="name", collapse=FALSE, stacking="dense")

	# map each itemRgb string to its hex color, as above
	feat <- unique(feature(track))
	featCol <- setNames(as.list(rgb(t(sapply(strsplit(feat, ","), as.numeric)), maxColorValue=255)), feat)
	displayPars(track) <- featCol
	track
})

plotTracks(tracks)
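
The track names in the lapply call are derived from the table names with gsub. As a quick check that can be run without Gviz, this sketch applies the same pattern to the six table names used above; the variable names tables and trackNames are only chosen for illustration.

```r
tables <- c("wgEncodeBroadHmmGm12878HMM", "wgEncodeBroadHmmH1hescHMM",
            "wgEncodeBroadHmmK562HMM", "wgEncodeBroadHmmHepg2HMM",
            "wgEncodeBroadHmmHuvecHMM", "wgEncodeBroadHmmHmecHMM")

# Remove the leading "wgEncodeBroadHmm" and the trailing "HMM";
# the match is case sensitive, so the two alternatives do not overlap.
trackNames <- gsub("^wgEncodeBroadHmm|HMM$", "", tables)
print(trackNames)
```

What remains is the cell line part of each table name, which is what ends up as the track title in the plot.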


Is that more or less what you were looking for?

Florian


--
Sent via the guest posting facility at bioconductor.org.


