[BioC] Biomart query in Web interface Vs. biomaRt package?

Steffen sdurinck at lbl.gov
Mon Oct 8 18:13:42 CEST 2007


Hi Jeremie,

Below is the answer from the Ensembl helpdesk. In short, the 'go' filter 
retrieves all genes associated with a particular GO identifier, whereas 
the 'biol_process' filter retrieves all genes associated with a 
particular GO identifier and all of its children. This explains why 
one gets more genes with 'biol_process' than with 'go' as the 
filter. (The Ensembl BioMart Web interface uses 'biol_process', and you 
used 'go' in your biomaRt query.)

Cheers,
Steffen

-----

When you query BioMart filtering on a specific GO term (GO:0006996, or a
list), you retrieve all the entries associated with that/those GO
term(s). But if you filter using a 'Biological process' and then add
an ID, you get all the entries matching that ID and all of its
children:

organelle organization and biogenesis [GO:0006996]
autophagic vacuole formation [GO:0000045]
chromosome organization and biogenesis [GO:0051276]
chromosome condensation [GO:0030261]
chromosome decondensation [GO:0051312]
chromosome organization and biogenesis (sensu Bacteria) [GO:0051277]
chromosome organization and biogenesis (sensu Eukaryota) [GO:0007001]
chromosome breakage [GO:0031052]
establishment and/or maintenance of chromatin architecture [GO:0006325]
karyosome formation [GO:0030717]
....     

As seen here:
http://www.ensembl.org/Homo_sapiens/goview?depth=2;query=organelle+organization+and+biogenesis

I hope this explains it,
-- Xose M Fernandez (Ensembl User Support)
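
The difference between the two filters can be sketched in base R without 
contacting BioMart. This is only an illustration: the GO identifiers are 
taken from the helpdesk list above, but the parent/child nesting in 
`edges`, the gene names in `annot`, and the helper `with_children()` are 
all made up for the example and do not reflect the real GO graph or 
Ensembl annotations.

```r
# Toy parent -> child edges. The IDs appear in the helpdesk list, but the
# exact nesting used here is an assumption made for illustration.
edges <- data.frame(
  parent = c("GO:0006996", "GO:0006996", "GO:0051276", "GO:0051276"),
  child  = c("GO:0000045", "GO:0051276", "GO:0030261", "GO:0051312"),
  stringsAsFactors = FALSE
)

# Hypothetical gene-to-term annotations (gene names are invented).
annot <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD", "geneA"),
  go   = c("GO:0006996", "GO:0000045", "GO:0030261",
           "GO:0051312", "GO:0051276"),
  stringsAsFactors = FALSE
)

# Collect a term together with all of its descendants (breadth-first).
with_children <- function(term, edges) {
  seen <- character(0)
  queue <- term
  while (length(queue) > 0) {
    seen <- union(seen, queue)
    queue <- setdiff(edges$child[edges$parent %in% queue], seen)
  }
  seen
}

# 'go'-style filter: genes annotated with exactly this term.
exact <- unique(annot$gene[annot$go == "GO:0006996"])

# 'biol_process'-style filter: genes annotated with the term or any descendant.
expanded <- unique(annot$gene[annot$go %in% with_children("GO:0006996", edges)])

length(exact)     # 1 gene in this toy data
length(expanded)  # 4 genes in this toy data
```

In this toy data the exact-term query returns one gene and the expanded 
query returns four, the same kind of gap as between the biomaRt result 
and the Web interface result discussed in this thread.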



J.J.P.Lebrec at lumc.nl wrote:
> Hi,
>
> Using the web based Biomart tool (
> http://www.ensembl.org/biomart/martview/ ) in database=Ensembl 46,
> dataset=Homo sapiens Genes (NCBI 36), I have manually extracted all
> unique genes' 'External Gene ID' using GO pathway GO:0006996 as a
> filter. I obtained 1141 unique genes.
>
> I tried to automate the process using the biomaRt package with the query
> below, which yielded only 9 unique genes!
>
>
>> human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> Checking attributes and filters ... ok
>
>> getBM(attributes = "external_gene_id", filters = "go", values = "GO:0006996", mart = human)
>    external_gene_id
> 1             KIF3A
> 2              HPS3
> 3              HPS3
> 4            DTNBP1
> 5            DTNBP1
> 6             KIF5C
> 7             KIF4A
> 8              HPS1
> 9              HPS6
> 10             HPS6
> 11             HPS6
> 12            KIF25
> 13             HPS4
>
>> sessionInfo()
> R version 2.5.1 (2007-06-27) 
> i386-pc-mingw32 
>
> locale:
> LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
>
> attached base packages:
> [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
> [7] "base"
>
> other attached packages:
>  biomaRt    RCurl      XML 
> "1.10.1"  "0.8-0"  "1.9-0" 
>   
>
> I thought the two queries were equivalent. Could you please tell me
> what I am doing wrong here?
>
> Many thanks in advance,
>
> Jeremie


