[BioC] how to retrieve data in list using getBM in biomaRt package?

James W. MacDonald jmacdon at med.umich.edu
Thu Aug 10 23:17:14 CEST 2006


Hi Steffen and Shiliang,

It is usually best to keep these sorts of conversations on the list, so 
others can learn. Since most of this conversation looks useful, I am 
linking back to the BioC list.

To answer your question, the problem with your 'repository' argument is 
that you are using 'go' as one of the arguments. Unfortunately, 
htmlpage() is not capable of doing the hyperlinks to GO right now. If 
you look at the 'HOWTO Get HTML Output' vignette in the biomaRt package, 
it should help. Basically, you want to do two calls to getBM(), one for 
those things that htmlpage() _can_ create hyperlinks for (Entrez Gene, 
SwissProt, RefSeq), and another for those things that htmlpage() 
_cannot_ create hyperlinks for (GO, GO description, etc).

You should then use

repository <- list("en","sp", "gb")

genelist <- getBM(<arguments here for _linkable_ identifiers>)
otherdata <- getBM(<arguments here for other things like GO>)

then if you also want to add expression values you can do something like

otherdata <- c(otherdata, <expression data>)

then do the call to htmlpage() like in the HOWTO vignette.
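
To make the split concrete, here is a rough base-R sketch of the bookkeeping 
only. It does not call getBM() or htmlpage(), so it runs without a mart 
connection. The attribute names are the ones used in this thread, the 
mapping to repository codes simply follows the list("en","sp","gb") above, 
and the 'wanted' selection, the 'expr' values and the stand-in 'otherdata' 
contents are made up for illustration.

```r
# Attribute names as used in this thread, mapped to the repository codes
# from list("en", "sp", "gb") above. Anything not listed here is treated
# as not linkable by htmlpage() (an assumption for this sketch).
linkable <- c(entrezgene = "en",
              uniprot_swissprot_accession = "sp",
              refseq_dna = "gb")

# Hypothetical set of columns we would like in the HTML table.
wanted <- c("entrezgene", "go", "uniprot_swissprot_accession",
            "refseq_dna", "go_description")

# Attributes for the first getBM() call (linkable) and the second (the rest).
link_attrs  <- wanted[wanted %in% names(linkable)]
other_attrs <- setdiff(wanted, link_attrs)

# One repository code per linkable column, in the same order.
repository <- as.list(unname(linkable[link_attrs]))

# Stand-in for the second getBM() result: a list with one element per
# column. The values are placeholders, not real annotation.
otherdata <- list(go = c("GO:0000001", "GO:0000002"),
                  go_description = c("term A", "term B"))

# Made-up expression values; c() on two lists concatenates their elements,
# so this adds one more column to the non-linked data.
expr <- list(logFC = c(1.2, -0.8))
otherdata <- c(otherdata, expr)

print(link_attrs)
print(other_attrs)
str(repository)
str(otherdata)
```

The point of deriving 'repository' from 'link_attrs' is that the codes stay 
in the same order as the columns of genelist, and 'go' can never end up in 
the repository list by accident.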

Best,

Jim



Steffen Durinck wrote:
> The following biomaRt query should give you GO information starting from 
> your ll object with Entrez Gene identifiers.
> 
> annotlist <- getBM(attributes = c("entrezgene","go","go_description"), 
> filters = "entrezgene", values = ll, mart = mart, 
> na.value = "&nbsp;", output = "list")
> 
> I've CC'd James MacDonald; he can probably help us with your htmlpage 
> problem.
> James, do you know how to use the repository feature that is used with 
> htmlpage?
> 
> best,
> Steffen
> 
> 
> swang wrote:
> 
>> Hi, Dr.Durinck:
>>  
>> I tried to do both: retrieve the updated GO identifiers for the 
>> Rosetta probes and create HTML page output with hyperlinks for:
>>  
>> Entrez Gene     GO     SwissProt     RefSeq
>>
>>  
>> So I am using the htmlpage function from the annotate package, but for some 
>> reason there is no clear instruction for repository (htmlpage function) 
>> in the annotate vignettes. I just wonder if you know about that.
>> Sorry for bothering you
>> thanks
>>  
>> Shiliang
>>
>>  
>> On 8/10/06, Steffen Durinck <durincks at mail.nih.gov> wrote:
>>
>>     Hi Shiliang,
>>
>>     I'm not sure if I understand what you are trying to do.
>>     Are you trying to retrieve GO identifiers using biomaRt?
>>     or
>>     Are you trying to create html pages by using the output of biomaRt in
>>     the htmlpage command of the annotate package?
>>
>>     best,
>>     Steffen
>>
>>     swang wrote:
>>     > Dr. Durinck:
>>     >
>>     > I used your mart link:
>>     > mart = useMart("ensembl", dataset="hsapiens_gene_ensembl", mysql=TRUE)
>>     > It works, and I tried to put the GO id into genelist so that I can
>>     > get the hyperlink for the GO id:
>>     > repository <- list("en","go","sp", "gb")
>>     > but it does not seem to work. I just wonder what I should put in the
>>     > repository. I checked your vignettes and help documents; there is no
>>     > specific instruction for repository.
>>     >
>>     > thanks
>>     >
>>     > Shiliang
>>     >
>>     >
>>     > On 8/10/06, Steffen Durinck <durincks at mail.nih.gov> wrote:
>>     >
>>     >     Hi Shiliang,
>>     >
>>     >     That's why we recommend using biomaRt in MySQL mode when you
>>     >     want a list as output.
>>     >     From your error I can see you are trying to retrieve a list as
>>     >     output, using biomaRt in webservice mode.
>>     >     How did you use the useMart function?
>>     >     You should do:
>>     >
>>     >     mart = useMart("ensembl", dataset="hsapiens_gene_ensembl",
>>     >     mysql=TRUE)
>>     >
>>     >     best,
>>     >     Steffen
>>     >
>>     >     swang wrote:
>>     >     > Dr.Durinck:
>>     >     >
>>     >     > I already tried this before; sometimes it works and sometimes
>>     >     > it doesn't.
>>     >     > I tried it again a few minutes ago and it worked, but then I
>>     >     > tried the following and it didn't. I believe there are some
>>     >     > bugs inside.
>>     >     > > annotlist <- getBM(attributes = c("description",
>>     >     > "chromosome_name",
>>     >     > "chromosome_location","go_description", "band"), filters =
>>     >     > "entrezgene", values = ll, mart = mart,  na.value =
>>     >     > "&nbsp;",output="list")
>>     >     > Error in postForm(paste(mart@host, "?", sep = ""),
>>     >     > query = xmlQuery) :
>>     >     >         couldn't connect to host
>>     >     >
>>     >     > > sessionInfo()
>>     >     > R version 2.4.0 Under development (unstable) (2006-07-25 r38698)
>>     >     > i386-pc-mingw32
>>     >     >
>>     >     > locale:
>>     >     > LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>>     >     > States.1252;LC_MONETARY=English_United
>>     >     > States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>     >     >
>>     >     > attached base packages:
>>     >     > [1] "tools"     "methods"   "stats"     "graphics"  "grDevices"
>>     >     > "utils"     "datasets"  "base"
>>     >     >
>>     >     > other attached packages:
>>     >     >   Rosetta    RMySQL       DBI     limma  annotate   annaffy
>>     >     >   "1.0.0"   "0.5-7"  "0.1-10"   " 2.7.5"  " 1.11.2"   "1.5.0"
>>     >     >      KEGG        GO   Biobase   biomaRt     RCurl       XML
>>     >     >  "1.12.0"  "1.12.0" "1.11.22"   " 1.7.5"   "0.6-3"  "0.99-8"
>>     >     > >
>>     >     >
>>     >     > My biomaRt is the newest version; I have already downloaded and
>>     >     > built today's biomaRt source code and tried it. The result is
>>     >     > the same.
>>     >     >
>>     >     >
>>     >     >
>>     >     >
>>     >     >
>>     >     >
>>     >     > On 8/10/06, Steffen Durinck <durincks at mail.nih.gov> wrote:
>>     >     >
>>     >     >     Hi Shiliang,
>>     >     >
>>     >     >     It looks like you forgot to specify the output argument
>>     >     >     as a list.
>>     >     >     Try:
>>     >     >
>>     >     >     genelistM <- getBM(attributes =
>>     >     >     c("entrezgene","go","uniprot_swissprot_accession",
>>     >     >     "refseq_dna"),
>>     >     >     filters = "entrezgene", values = ll, mart = mart,
>>     >     >     na.value = "&nbsp;", output="list")
>>     >     >
>>     >     >     best,
>>     >     >     Steffen
>>     >     >
>>     >     >     swang wrote:
>>     >     >     > Hi,Dr. Durinck:
>>     >     >     >
>>     >     >     >
>>     >     >     > I do need to get a list back from the getBM function in
>>     >     >     > the biomaRt package. I read your feedback about getBM
>>     >     >     > when the probe is long:
>>     >     >     >
>>     >     >     > http://article.gmane.org/gmane.science.biology.informatics.conductor/9172/match=biomart+rmysql+mode
>>     >     >     >
>>     >     >     > but can you tell me how to get a list back using biomaRt
>>     >     >     > in RMySQL mode?
>>     >     >     > I have 250 entrezgene ids and try to get back
>>     >     >     > genelistM <- getBM(attributes =
>>     >     >     > c("entrezgene","go","uniprot_swissprot_accession",
>>     >     >     > "refseq_dna"), filters = "entrezgene",
>>     >     >     > values = ll, mart = mart, na.value = "&nbsp;")
>>     >     >     >
>>     >     >     > length(ll)
>>     >     >     > 250
>>     >     >     > ll <- getLL(probe,data = "Rosetta")
>>     >     >     >
>>     >     >     > I cannot get it back as a list; otherwise it tells me
>>     >     >     > something is wrong.
>>     >     >     >
>>     >     >     > Thanks
>>     >     >     >
>>     >     >     > Shiliang
>>     >     >     >
>>     >     >     >       [[alternative HTML version deleted]]
>>     >     >     >
>>     >     >     > _______________________________________________
>>     >     >     > Bioconductor mailing list
>>     >     >     > Bioconductor at stat.math.ethz.ch
>>     >     >     > https://stat.ethz.ch/mailman/listinfo/bioconductor
>>     >     >     > Search the archives:
>>     >     >     > http://news.gmane.org/gmane.science.biology.informatics.conductor
>>     >     >     >
>>     >     >
>>     >     >
>>     >
>>     >
>>     >     --
>>     >     Steffen Durinck, Ph.D.
>>     >
>>     >     Oncogenomics Section
>>     >     Pediatric Oncology Branch
>>     >     National Cancer Institute, National Institutes of Health
>>     >     URL: http://home.ccr.cancer.gov/oncology/oncogenomics/
>>     >
>>     >     Phone: 301-402-8103
>>     >     Address:
>>     >     Advanced Technology Center,
>>     >     8717 Grovemont Circle
>>     >     Gaithersburg, MD 20877
>>     >
>>     >
>>
>>
>>
>>
> 
> 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

