[BioC] allen brain atlas

Radek Blatny blatny at img.cas.cz
Mon Aug 25 16:01:11 CEST 2008


Hi Martin, thank you very much for the additional code. Not having a  
batch query is strange since the web interface has a possibility to  
query multiple genes at once (separated by coma) so I would expect to  
have this possibility in the API as well. Another solution might be  
to download the whole expression data in an xml file and parse it  
into an R table-type object. However I wasn't able to find the whole  
db in an xml file.

Regards, Radek



On Aug 25, 2008, at 3:47 PM, Martin Morgan wrote:

Hi Radek --

"Radek Blatny" <Radek.Blatny at img.cas.cz> writes:

> Hi Martin,
>
> the code works perfectly, thank you! I've got one more question. Is  
> there
> a way to do a batch query?
>
> When I tried to send a list of symbols to the script, the process  
> stopped
> with an error if one of the genes was missing in the ABA and there  
> was no
> data retrieved at all. Any suggestions?

It seems like the ABA lets you query individual terms, or to do a
search, but not to do a batch-style query. A solution might be

abaBatchGenes <- function(symbols) {
     res <- lapply(symbols, function(symbol) {
         tryCatch(abaGene(symbol),
                  error=function(err) {
                      list(gene_expressions=data.frame(
                             avgdensity=numeric(),
                             avglevel=numeric()),
                           image_series=data.frame(
                             imageseriesdisplayname=character(),
                             age=integer(),
                             imageseriesid=integer(),
                             sex=factor(levels=c("F", "M"))))
                  })
     })
     names(res) <- symbols
     res
}

and then

> abaBatchGenes(c("Coch", "Foo"))

> Ideally, I would like to send a list of gene symbols, skip the genes,
> which are not present in the ABA and retrieve a dataframe  
> containing the
> records consisiting of a gene symbol and expression values for each  
> brain
> region.

Backing up, and with your specific objective in mind, one might

abaGene1 <- function(symbol) {
     url <- abaUrl("gene", symbol)
     xml <- xmlTreeParse(url, useInternal=TRUE)
     data.frame(symbol=symbol,
                structurename=xq(xml, "//structurename"),
                avgdensity=as.numeric(xq(xml, "//avgdensity")),
                avglevel=as.numeric(xq(xml, "//avglevel")),
                row.names=seq_len(xq(xml, "count(//structurename)")))
}

abaBatchGene1 <- function(symbols) {
     res <- lapply(symbols, function(symbol) {
         tryCatch(abaGene1(symbol),
                  error=function(err) FALSE)
     })
     do.call("rbind", Filter(is.data.frame, res))
}

and then

> abaBatchGene1(c("Coch", "Foo", "Calb1"))
failed to load HTTP resource
    symbol                  structurename avgdensity   avglevel
1    Coch                     Cerebellum   7.888339  11.977408
2    Coch                Cerebral cortex  52.086498  46.967388
[SNIP]
16   Coch        Striatum ventral region   6.137324   4.482482
17   Coch                       Thalamus  27.540497  36.667355
18  Calb1                     Cerebellum  67.864638  95.345704
19  Calb1                Cerebral cortex  76.402445  89.302411
[SNIP]
33  Calb1        Striatum ventral region  88.071941  87.418639
34  Calb1                       Thalamus  79.896777  87.737941

Martin

> Thanks, Radek
>
>> Hi Radek --
>>
>> Radek Blatny <blatny at img.cas.cz> writes:
>>
>>> Hi, is there a way to access the expression data from the Allen  
>>> Brain
>>> Atlas and use them for annotation of a topTable? I am particularly
>>> interested in being able to download expression profiles for gene
>>> symbols (or other suitable identifiers), which is possible to query
>>> in semiquantitative graphical format for all brain anatomic regions
>>> at the ABA webpage - for instance here:
>>>
>>> <http://mouse.brain-map.org/brain/Myst4.html?ispopup=1>
>>
>> Not sure of a package, but it might be easy enough to create  
>> something
>> useful yourself. Here's a quite fun first attempt:
>>
>> library(XML)
>>
>> ## some utitlity functions
>>
>> abaUrl <- function(type, id = "") {
>>     if (!missing(id))
>>         id <- paste("/",id, ".xml", sep="")
>>     paste("http://www.brain-map.org/aba/api/", type, id, sep="")
>> }
>>
>> xq <- function(xml, query) {
>>     unlist(xpathApply(xml, query, xmlValue))
>> }
>>
>> ## retrieve information about a gene; separate some of the info into
>> ## two data frames
>>
>> abaGene <- function(symbol) {
>>     url <- abaUrl("gene", symbol)
>>     xml <- xmlTreeParse(url, useInternal=TRUE)
>>     res <- list(gene_expressions=data.frame(
>>                     structurename=xq(xml, "//structurename"),
>>                     avgdensity=as.numeric(xq(xml, "//avgdensity")),
>>                     avglevel=as.numeric(xq(xml, "//avglevel")),
>>                     row.names=1),
>>                 image_series=data.frame(
>>                   imageseriesdisplayname=xq(
>>                     xml, "//imageseriesdisplayname"),
>>                   age=as.integer(xq(xml, "//age")),
>>                   imageseriesid=as.integer(xq(
>>                     xml, "//imageseriesid")),
>>                   sex=factor(xq(xml, "//sex"))))
>>     free(xml)
>>     res
>> }
>>
>>
>> With this one can
>>
>>> coch <- abaGene("Coch")
>>> coch
>> $gene_expressions
>>                                avgdensity   avglevel
>> Cerebellum                       7.888339  11.977408
>> Cerebral cortex                 52.086498  46.967388
>> Hippocampal region               2.939453   5.846055
>> Hippocampal formation            4.755450   6.896857
>> Hypothalamus                     8.690369   9.506371
>> Lateral septal complex           8.783522   7.551568
>> Midbrain                         9.267348  11.813993
>> Medulla                          7.031233  12.667056
>> Olfactory bulb                  26.625715  34.037346
>> Pons                             3.897534   5.504243
>> Pallidum                         8.595852   9.827189
>> Retrohippocampal region          7.256083   7.999830
>> Striatum-like amygdalar nuclei   7.874489   9.268417
>> Striatum                        74.447227  79.294968
>> Striatum dorsal region         100.000000 100.000000
>> Striatum ventral region          6.137324   4.482482
>> Thalamus                        27.540497  36.667355
>>
>> $image_series
>>   imageseriesdisplayname age imageseriesid sex
>> 1  Coch-Sagittal-06-0609  55      75990683   M
>> 2   Coch-Coronal-05-2779  55      71717614   M
>>
>>
>> For more fun I used the EBImage package and a little more exploration
>>
>> ## Retrieve an image
>>
>> library(EBImage)
>>
>> abaImageSeries <- function(imageseriesid) {
>>     url <- abaUrl("imageseries", imageseriesid)
>>     xml <- xmlTreeParse(url, useInternal=TRUE)
>>     res <- list(images=data.frame(
>>                   imagedisplayname=xq(xml, "//imagedisplayname"),
>>                   downloadImagePath=xq(xml, "//downloadImagePath"),
>>                   stringsAsFactors=FALSE))
>>     free(xml)
>>     res
>> }
>>
>> abaImage <- function(downloadImagePath, zoom=1) {
>>     url <- paste(abaUrl("image"),
>>                  "?zoom=", zoom,
>>                  "&path=", downloadImagePath, sep="")
>>     readImage(url)
>> }
>>
>> and then
>>
>>
>>> series <- abaImageSeries(coch$image_series[1,"imageseriesid"])
>>> series
>> $images
>>    imagedisplayname
>> 1            Coch_2
>> 2           Coch_10
>> 3           Coch_18
>> 4           Coch_26
>> 5           Coch_34
>> 6           Coch_42
>> 7           Coch_50
>> 8           Coch_58
>> 9           Coch_74
>> 10          Coch_82
>> 11          Coch_90
>> 12          Coch_98
>> 13         Coch_106
>> 14         Coch_114
>> 15         Coch_122
>> 16         Coch_130
>> 17         Coch_138
>> 18         Coch_146
>> 19         Coch_154
>>                                                                       
>>  downloadImagePath
>> 1
>> production18/Coch_06-0609_38068/zoomify/primary/0203016726/ 
>> Coch_2_0203016726_A.aff
>> 2
>> production18/Coch_06-0609_38068/zoomify/primary/0203016726/ 
>> Coch_10_0203016726_B.aff
>> 3
>> production18/Coch_06-0609_38068/zoomify/primary/0203016726/ 
>> Coch_18_0203016726_C.aff
>> 4
>> production18/Coch_06-0609_38068/zoomify/primary/0203016726/ 
>> Coch_26_0203016726_D.aff
>> 5
>> production18/Coch_06-0609_38068/zoomify/primary/0203026750/ 
>> Coch_34_0203026750_A.aff
>> 6
>> production18/Coch_06-0609_38068/zoomify/primary/0203026750/ 
>> Coch_42_0203026750_B.aff
>> 7
>> production18/Coch_06-0609_38068/zoomify/primary/0203026750/ 
>> Coch_50_0203026750_C.aff
>> 8
>> production18/Coch_06-0609_38068/zoomify/primary/0203026750/ 
>> Coch_58_0203026750_D.aff
>> 9
>> production18/Coch_06-0609_38068/zoomify/primary/0203034952/ 
>> Coch_74_0203034952_B.aff
>> 10
>> production18/Coch_06-0609_38068/zoomify/primary/0203034952/ 
>> Coch_82_0203034952_C.aff
>> 11
>> production18/Coch_06-0609_38068/zoomify/primary/0203034952/ 
>> Coch_90_0203034952_D.aff
>> 12
>> production18/Coch_06-0609_38068/zoomify/primary/0203044879/ 
>> Coch_98_0203044879_A.aff
>> 13
>> production18/Coch_06-0609_38068/zoomify/primary/0203044879/ 
>> Coch_106_0203044879_B.aff
>> 14
>> production18/Coch_06-0609_38068/zoomify/primary/0203044879/ 
>> Coch_114_0203044879_C.aff
>> 15
>> production18/Coch_06-0609_38068/zoomify/primary/0203044879/ 
>> Coch_122_0203044879_D.aff
>> 16
>> production18/Coch_06-0609_38068/zoomify/primary/0203054806/ 
>> Coch_130_0203054806_A.aff
>> 17
>> production18/Coch_06-0609_38068/zoomify/primary/0203054806/ 
>> Coch_138_0203054806_B.aff
>> 18
>> production18/Coch_06-0609_38068/zoomify/primary/0203054806/ 
>> Coch_146_0203054806_C.aff
>> 19
>> production18/Coch_06-0609_38068/zoomify/primary/0203054806/ 
>> Coch_154_0203054806_D.aff
>>
>>> img <- abaImage(series$images[1,"downloadImagePath"], 2)
>>> img
>>
>> 'Image'
>>   colorMode()   : Grayscale
>>   storage class : numeric 3D array, writable images in range [0..1]
>>   dim()         : 513x414
>>   fileName()    : /tmp/magick-XXxlP9B8
>>   compression() : JPEG
>>   resolution()  : dx = 72.0, dy = 72.0
>>
>> image 1/1:
>>           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
>> [1,] 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922
>> [2,] 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922
>> [3,] 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922
>> [4,] 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922
>> [5,] 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922 0.9803922
>>  ...
>>> display(img)
>>
>> This displays an image at a sort of reasonable resolution; it's then
>> possible to use EBImage to do all kinds of fun manipulations. There's
>> an incredible amount of scope for programmatic manipulation here.
>>
>> The way to construct the queries above, and the structure of the
>> return data, are described on the 'API' page, e.g., following the  
>> link
>> to 'API documetation' on this page:
>>
>> http://community.brain-map.org/confluence/display/DataAPI/Home
>>
>> Martin
>>
>>> Regards, Radek
>>>
>>>
>>> Radek Blatny, MSc.
>>> Institute of Molecular Genetics
>>> Department of Mouse Molecular Genetics (C/O Jiri Forejt)
>>> Czech Academy of Sciences
>>> Videnska 1083
>>> 142 20, Prague
>>> Czech Republic
>>> Tel. (+420) 241 062 260
>>> Fax (+420) 241 062 154
>>> http://www.img.cas.cz/mmg
>>> email: blatny at img.cas.cz
>>> Skype name: blatny
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M2 B169
>> Phone: (206) 667-2793
>>
>

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793



More information about the Bioconductor mailing list