[BioC] Using htmlpage

john seers (IFR) john.seers at bbsrc.ac.uk
Fri Feb 13 17:20:09 CET 2009


Hi Jim


>As much as I would like to take credit, this is all Martin Morgan.

Ah, but you made it happen, so don't shrink from taking some credit.
(But thanks Martin Morgan as well).


>As I said in an earlier email, I can only help if you tell me _exactly_
>what you want to do. I'm not smart enough to parse what you have said
>above or in earlier messages. If someone asked me 'What exactly does
>John Seers want?', I would only be able to say something about you
>wanting to do something with some data and end up with links to
>Ensembl.
>And that is seriously not enough to go on.

I understand your problem but it is a little unfair to pile it all back
on me. The problem is I do not know exactly what I want to do. Or
perhaps I do but I do not know if what I want to do is sensible in your
world. Remember I am dealing with uncertainty as well and have a user
who does not know exactly what they want to do. If I had an exact
specification I would give it to you. But I do not know the full
capability of www.ensembl.org , what queries/links are available and the
best way to do it. I think you are closer to it than me so you are more
likely to know the best way forward. I have solved what I wanted to do
with a couple of clunky workarounds. If on the way I have helped with
stimulating some ideas that lead to some improvements then great. If you
think my suggestions are not the way forward then fine. If I could get
rid of both my clunky workarounds then excellent. If I can only get rid
of one then still pretty good.


>about you
>wanting to do something with some data and end up with links to
>Ensembl.
>And that is seriously not enough to go on.

Well, why not? (That is not sarcastic). That just about sums up what I
thought was needed. That is, free up the code from specific repositories
and link/query structures. What I was saying is that the data needs to
get through to the helper function and once there you do not have to
worry about the nature of the links. The helper function builds the link
and you do not care how. There is complete flexibility and you can build
any link/query you want. You can supply a basic list of helper functions
(as you do now) and they can be added to by the user as needed. I
suggested perhaps it could be done with the "..." notation (or something
similar) but that did not appeal to you. I do not know why this does not
appeal to you but you are the expert in this area. 
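
A minimal sketch of the pass-through idea, in base R only. The three function names below are hypothetical stand-ins for the chain of calls between the page builder and the helper; they are not the real annotate internals. The point is only that an extra named argument (here 'transcript') can reach the helper via "..." without the intermediate layers knowing about it. The ids are the placeholder ids used elsewhere in this message.

```r
# Hypothetical sketch: these functions are NOT the annotate internals.
# They only mimic a chain of calls that ends in a link-building helper.

# Helper at the bottom of the chain. 'ids' is accepted but not used here;
# the link is built from the extra 'transcript' argument instead.
exon_link_helper <- function(ids, transcript, species = "Mus_musculus", ...) {
  paste("http://www.ensembl.org/", species, "/Transcript/Exons?t=",
        transcript, sep = "")
}

# Middle layer: knows nothing about 'transcript'; it forwards everything.
build_links <- function(ids, helper, ...) {
  helper(ids, ...)
}

# Top layer: stands in for a page-building function.
make_page_links <- function(ids, helper, ...) {
  build_links(ids, helper, ...)
}

links <- make_page_links(
  ids        = "ENSE00000012345",   # placeholder exon id
  helper     = exon_link_helper,
  transcript = "ENST00000405446"    # placeholder transcript id
)
print(links)
```

Because the middle layers only forward "...", a user-supplied helper can ask for whatever extra columns it needs without any change to the code in between.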

The specific example I gave was this:

>>One of the specific problems I had was I had an exon id as the id. But
>>the Ensembl query had to use the transcript name for the query string.
>>
>>Something like:
>>
>>out[i] <- paste("http://www.ensembl.org/Mus_musculus/Transcript/Exons?t=",
>>                ids[i], sep = "")
>>
>>So I did not want the ids[i] but a way of getting to my transcript id
>>column to generate t=transcript_id.

I am not sure how much more exact I could be in at least giving an
example of my (uncertain) problems.


>OK. Given an Ensembl transcript ID (say, ENST00000405446), do you not
>want to end up here?
>
>http://www.ensembl.org/Homo_sapiens/Search/Summary?species=Homo_sapiens;idx=;q=ENST00000405446
>
>If not, then where might you want to be?

Well no, that is straightforward with the existing code. This is one of
the idiosyncrasies of Ensembl (and presumably all database sites have
them) and I have not found if there is another way to do what I want. My
problem is I have the Exon id (for example ENSE00000012345 with an E for
Exon) and there is no direct Ensembl exon query that I have been able to
find (yet). The query looks something like (from above):
 
http://www.ensembl.org/Mus_musculus/Transcript/Exons?t=ENST00000405446

(N.B. This will not work without a valid transcript). 

So I need to pull in the transcript id (ENST00000405446) from another
column but I cannot get to it from the helper function. My clunky
workaround is to duplicate the transcript column and build the exon
query on that. So the user finds the exon they are interested in,
backtracks to transcript column 2 and clicks. I would like them to click
on the exon id in the exon column. (Transcript column 1 naturally has
the standard transcript query).
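
To make the wished-for behaviour concrete, here is a rough base R sketch, assuming a table with one exon column and one transcript column. The function name exon_anchor and the column names are hypothetical, and the ids are placeholders from this message, so the resulting URL is not expected to resolve. The visible text of each cell is the exon id, while the link target is built from the transcript id in the same row.

```r
# Hypothetical helper: build HTML anchors whose text is the exon id but
# whose href uses the transcript id from the same row.
exon_anchor <- function(exon_ids, transcript_ids, species = "Mus_musculus") {
  stopifnot(length(exon_ids) == length(transcript_ids))
  urls <- paste("http://www.ensembl.org/", species,
                "/Transcript/Exons?t=", transcript_ids, sep = "")
  paste('<a href="', urls, '">', exon_ids, '</a>', sep = "")
}

# Placeholder data: one row, ids taken from the examples in this message.
tab <- data.frame(exon       = "ENSE00000012345",
                  transcript = "ENST00000405446",
                  stringsAsFactors = FALSE)

# The exon column becomes clickable; no duplicated transcript column needed.
tab$exon_html <- exon_anchor(tab$exon, tab$transcript)
print(tab$exon_html)
```

This only works if the link builder can see both columns at once, which is exactly what the helper function cannot do at present.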

But that approach on a specific query misses the point. I was trying not
to bring it down to specific examples because I think it could be
completely flexible. But what do I know? I may be completely wrong and
this is not the philosophy/design of repositories and htmlpage and
ensembl and I have missed the point of how it should be done. I am not
in a position to be that prescriptive of how it should be done. Perhaps
what you have in the code already can do what I want but I just cannot
see it.

However I hope that does not sound ungrateful for your efforts because I
am very grateful. I have not looked at it yet but I am sure you have
improved the usability of modifying getQueryLink etc which was the main
difficulty I had. 

>You will have to run the devel version of R, as the changes are only in
>the devel version of BioC.

Oh blow! I guess I cannot resist forever. :)


Regards and thanks


John  

 
---


-----Original Message-----
From: James W. MacDonald [mailto:jmacdon at med.umich.edu] 
Sent: 13 February 2009 14:36
To: john seers (IFR)
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Using htmlpage

Hi John,

john seers (IFR) wrote:
> Hello Jim
> 
> Thanks for this. Done impressively quickly.
> 
>> makes smart use of an environment to hold the getQuery4XX() functions,
>> and you can add new ones to the environment, thereby bypassing the
>> namespace.
> 
> It looks like you have found a neat way to do this and to make it a lot
> easier to use. I look forward to giving it a go. I expect I will be
> curious enough to look at the code and see how it was done.

As much as I would like to take credit, this is all Martin Morgan.
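
For anyone curious about the general idea, the following is a rough sketch of an environment used as a registry of link-building functions. It is not the code in annotate; every name in it (.link_builders, register_builder, list_builders, remove_builder, build_link) is made up for illustration, and only base R is assumed.

```r
# Hypothetical registry, NOT the annotate implementation.
# An environment holds the link-building functions, keyed by name.
.link_builders <- new.env(parent = emptyenv())

# Add (or overwrite) a builder under a given name.
register_builder <- function(name, FUN) {
  stopifnot(is.character(name), length(name) == 1, is.function(FUN))
  assign(name, FUN, envir = .link_builders)
  invisible(name)
}

# List the names of all registered builders.
list_builders <- function() sort(ls(.link_builders))

# Remove a builder if it is present.
remove_builder <- function(name) {
  if (exists(name, envir = .link_builders, inherits = FALSE)) {
    rm(list = name, envir = .link_builders)
  }
  invisible(name)
}

# Look up a builder by name and call it, forwarding extra arguments.
build_link <- function(name, ids, ...) {
  FUN <- get(name, envir = .link_builders, inherits = FALSE)
  FUN(ids, ...)
}

# Example: a user-defined builder that needs an extra 'species' argument.
register_builder("ens_search", function(ids, species, ...) {
  paste("http://www.ensembl.org/", species,
        "/Search/Summary?species=", species, ";idx=;q=", ids, sep = "")
})

print(list_builders())
print(build_link("ens_search", "ENST00000405446", species = "Homo_sapiens"))
```

Since the functions live in an ordinary environment rather than in the package namespace, a user can add or replace one at run time without touching the package source.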

> 
>> You can create links to Ensembl, using the repository 'ens'. Since
>> AFAICT Ensembl requires a species to be part of the URI, you will have
>> to pass an additional argument 'species' to htmlpage(). The form of this
>> argument is species="Homo_sapiens" for, e.g., human.
> 
> Ensembl is a good addition. But I am not sure if this particular
> addition addresses the specific problem I described. But I will have a
> look and report back to you. That involved an Exon search that needs the
> ensembl transcript id. My workaround for that was to duplicate my
> transcript column and perform the exon search on that. Perhaps there is
> another way to do it - I will have another go.

OK. Given an Ensembl transcript ID (say, ENST00000405446), do you not 
want to end up here?

http://www.ensembl.org/Homo_sapiens/Search/Summary?species=Homo_sapiens;idx=;q=ENST00000405446

If not, then where might you want to be?

As I said in an earlier email, I can only help if you tell me _exactly_ 
what you want to do. I'm not smart enough to parse what you have said 
above or in earlier messages. If someone asked me 'What exactly does 
John Seers want?', I would only be able to say something about you 
wanting to do something with some data and end up with links to Ensembl.
And that is seriously not enough to go on.

> 
> I guess this is, or will be soon, available as a new version of
> annotate. I will have a look and see if it is there. If not perhaps I
> will have to give in and follow your recommendation to run from source.

You will have to run the devel version of R, as the changes are only in
the devel version of BioC. The windows binary _would_ have been up there
today if I hadn't left two uncommented underscores in the help page for
htmlpage() (which oddly enough doesn't seem to bother either Linux or
Mac OS). So if you want to compile from source, you can start today.

If you want a windows binary you will have to wait a day or two.

Best,

Jim


> 
> Thanks again. I will give you some feedback on your efforts.
> 
> Regards
> 
> 
> John
>   
> 
>  
> ---
> 
> -----Original Message-----
> From: James W. MacDonald [mailto:jmacdon at med.umich.edu] 
> Sent: 12 February 2009 18:39
> To: john seers (IFR)
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Using htmlpage
> 
> Hi John,
> 
> Thanks to some help from Martin Morgan (thanks Martin!), the devel 
> version of annotate now has two additions.
> 
> 1.) You can create links to Ensembl, using the repository 'ens'. Since
> AFAICT Ensembl requires a species to be part of the URI, you will have
> to pass an additional argument 'species' to htmlpage(). The form of this
> argument is species="Homo_sapiens" for, e.g., human. If you forget to
> pass the argument it will bomb out with an error.
> 
> This should work with ENSG, ENST, or ENSP identifiers, but let me know
> if you have problems.
> 
> 2.) You can also now create links for arbitrary websites. There are
> three new functions, setRepository(), getRepositories() and
> clearRepository() that you can use to set up, look at, and remove
> repositories, respectively. This is all thanks to Martin - he makes
> smart use of an environment to hold the getQuery4XX() functions, and you
> can add new ones to the environment, thereby bypassing the namespace.
> 
> The function you add should have similar form to any of the
> getQuery4XX() functions. Just write up the function, and use
> setRepository() to put it into the environment. There are some examples
> in the help page for these functions that will hopefully get anyone
> interested in this started. Again, let me know if it is not clear, or if
> there are any problems.
> 
> Best,
> 
> Jim
> 
> 
> 
> john seers (IFR) wrote:
>> Hi Jim
>>
>>
>> Searching on ellipsis and pass through I got the following.
>>
>> "The second argument to boot(), called 'statistic', can be
>> any user-written function you want to cook up, with additional
>> arguments being passed to it through the '...' mechanism after
>> all of the named arguments. (See: `R-intro `Writing your own
>> functions `The ellipsis argument for details.)"
>>
>> Looking at the boot code shows the user function "statistic" in the
>> signature and various calls to statistic of the form 
>>
>> t0 <- statistic(data, original, rep(1, sum(m)), ...)
>>
>>
>> function (data, statistic, R, sim = "ordinary", stype = "i",
>>     strata = rep(1, n), L = NULL, m = 0, weights = NULL,
>>     ran.gen = function(d, p) d, mle = NULL, simple = FALSE, ...)
>> {
>>
>> This is sort of what I had in mind. (Though I have not found the code to
>> pass through the "rest" of the variables.). No parsing needed.
>>
>>
>>> However, the ellipsis is designed to pass arbitrary variables to
>>> underlying code.
>>
>> I think that is what I am suggesting needs to be done - pass through
>> arbitrary variables. Perhaps a problem you have is htmlpage calls
>> getCells which calls getQueryLink which calls the helper function(s).
>>
>> Not sure if this is any help to you.
>>
>> Regards
>>
>> John
>>
>>
>>   
>>  
>> ---
>>

-- 
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662


