[BioC] limma topTable question

Tue Apr 5 09:23:39 CEST 2005

See comments below

On Mon, Apr 04, 2005 at 09:37:05PM -0700, Cyrus Harmon wrote:
> 
> On Apr 4, 2005, at 8:14 PM, Sean Davis wrote:
> 
> >The beauty of R and BioConductor is that YOU can modify option M:
> >
> >If you type topTable (without any parentheses or arguments), you will 
> >see that topTable is a VERY thin wrapper around a call to the function 
> >toptable. If you do the same for toptable (type it without any 
> >parentheses or arguments), you will see about half way down the 
> >definition of the function a line that looks like:
> >
> >ord <- switch(sort.by, M=order(abs(M), decreasing=TRUE), ....
> 
> Indeed. Thanks for pointing this out. This is certainly quicker than my 
> usual approach of digging up the source file and finding 
> the function definition in situ. Not quite as handy as M-. in SLIME (an 
> Emacs-based IDE/debugger for Common Lisp) but very nice.
> 
> >You can make a copy of toptable called my.toptable by:
> >
> >my.toptable <- toptable
> >
> >Then, change the order for 'M' in my.toptable to be:
> >
> >ord <- switch(sort.by, M=order(M, decreasing=TRUE), ....
> 
> What do you mean by "change the order ... in my.toptable"? I get the 
> obvious part, but the question is more of a mechanical one: having done 
> my.toptable <- toptable, how do I edit the ord <- line? Clearly, I can 
> put the function definition in an emacs/ESS buffer and eval the 
> function def, but is there a better way to do this? The REPL is very 
> nice, but the model of eval'ing function defs or regions one at a time

What about C-c C-f: eval function?

> in emacs buffers seems somewhat cloddish. Back to the parallel, with 
> slime, is there a nice way to make this change take effect? It seems 
> that the problem is magnified if I'm trying to develop an R extension 
> as I have to do R CMD INSTALL in order to get the change to take effect 
> in the place I eventually want to use it. I realize I've gone totally 
> off of the topic from the original question, but if the preferred model 
> of tweaking packages like this is as you've described, I feel like I 
> must be missing something about the mechanics of writing and eval'ing R 
> code.

I am a bit unsure what your problem is. There are several approaches:
  1) make a my.toptable and then edit topTable to call this function 
instead of toptable
  2) make a function called toptable in your global environment. This 
will override the function in topTable, unless limma is using 
namespaces.
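
To make approach 2 and the namespace caveat concrete, here is a minimal 
sketch with toy functions. inner.fun and outer.fun are hypothetical 
stand-ins for toptable and topTable, not limma's code; the sd/var part 
only relies on the stats package having a namespace.

```r
# Toy stand-ins (hypothetical names): outer.fun calls inner.fun, the way
# topTable calls toptable.
inner.fun <- function(M) order(abs(M), decreasing = TRUE)
outer.fun <- function(M) M[inner.fun(M)]

x <- c(-3, 1, 2)
before <- outer.fun(x)    # sorted by absolute value

# Approach 2: redefine the inner function in the same environment.
# outer.fun was defined there too, so it picks up the new definition.
inner.fun <- function(M) order(M, decreasing = TRUE)
after <- outer.fun(x)     # now sorted by signed value

# With a namespace the trick stops working: stats::sd calls var(), but it
# finds var in the stats namespace first, so a var defined here is ignored.
var <- function(...) stop("user-level var was called")
sd.result <- sd(c(1, 2, 3))   # no error, stats' own var is used
rm(var)                       # remove the mask again
```

If limma has a namespace, the same thing happens to a toptable you define 
yourself: topTable keeps calling limma's version, and approach 1 is the one 
to use.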

If your problem is more along the lines of editing the toptable function, 
you can do
  fix(toptable)
or
  toptable <- edit(toptable)
(with a suitable choice of options(editor = ...))
or you can do
  dump("toptable", file = "toptable.R")
(creates a file called toptable.R, which you can edit and then source). 

In your scripts you can just source this function.
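
As a rough sketch of the dump/edit/source cycle, using a toy function 
rather than the real toptable (toy.sort is a made-up name, and the sub() 
call only stands in for the change you would make by hand in your editor):

```r
# Hypothetical toy function standing in for toptable.
toy.sort <- function(M) M[order(abs(M), decreasing = TRUE)]

# Write the definition to a file; dump() takes the object's name as a string.
f <- tempfile(fileext = ".R")
dump("toy.sort", file = f)

# Stand-in for the manual edit: drop the abs() call in the file.
txt <- readLines(f)
writeLines(sub("abs(M)", "M", txt, fixed = TRUE), f)

# Re-read the edited definition; this replaces toy.sort where it was defined.
source(f, local = TRUE)
edited <- toy.sort(c(-3, 1, 2))   # sorted by signed value now
unlink(f)
```

For real work you would keep the file instead of deleting it and source it 
at the top of your scripts.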

> >If you make a copy of topTable called my.topTable by:
> >
> >my.topTable <- topTable
> >
> >and change it so that it calls my.toptable instead of toptable, you 
> >now have your own function called my.topTable that does what you want. 
> > You can of course make any other changes to the functions that you 
> >want--add your own options, etc.  The simple task of looking at 
> >others' code is quite powerful when dealing with issues like the one 
> >you bring up.  I would encourage all who use bioconductor and R to try 
> >it whenever possible; even if it doesn't all make sense, it is a very 
> >good way to learn.
> 
> Sure, but I'd hope that package maintainers were open to well-written 
> and documented patches that added the functionality to the library 
> itself, rather than having tons of local copies of possibly out-of-date 
> code lying around. I suppose, going back to my previous question, I could 
> store my.toptable.diff and apply the diff on the fly and iff the patch 
> succeeds eval a modified my.toptable, but that seems a bit hokey.

They generally are, but in my opinion, a developer has to achieve a 
balance between enough options/arguments in a function and too many. It 
is not certain that a particular twist on a function is something which 
will be widely enough used to warrant a host of new arguments (I am not 
saying this particular case is one of these).
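
For illustration only, this is roughly what such a twist could look like 
as one extra choice in an existing argument instead of a new argument. 
sort.stat and the choice names "absM" and "M" are made up for this sketch; 
they are not limma's toptable or its options.

```r
# Hypothetical helper, not limma's toptable: sort.by selects the ordering.
sort.stat <- function(M, sort.by = c("absM", "M")) {
  sort.by <- match.arg(sort.by)
  ord <- switch(sort.by,
                absM = order(abs(M), decreasing = TRUE),  # largest magnitude first
                M    = order(M, decreasing = TRUE))       # largest signed value first
  M[ord]
}

by.abs    <- sort.stat(c(-3, 1, 2))                 # default: absolute value
by.signed <- sort.stat(c(-3, 1, 2), sort.by = "M")  # signed value
```

One more branch in the switch is cheap; the cost is mostly in documenting 
and maintaining every such choice.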

> >
> >>(Combining my question and my gripe, a sort by "m" that didn't do 
> >>abs(M) would seem useful to me, but perhaps I'm missing something.)
> >
> >If you are not typically a programmer in bioconductor, this seems like 
> >a good chance to try your hand at it.  If you get something that you 
> >like better than what Gordon has offered in Limma, send him the 
> >modified code. He, like most of bioconductor/R developers, is 
> >remarkably receptive and responsive to criticism/improvements.
> 
> This is great to hear. I feel like I'm having trouble figuring out how 
> to develop mid-size projects in R. Clearly, typing R commands straight 
> into the REPL is a nice way to play around, and clearly the R extension 
> mechanism is a great way to package up R extensions for distribution, 
> but for developing my own mid-size R packages, I'm still unclear on 
> reasonable idioms for putting together my own mid-size projects. So far 
> the best I've come up with is local packages that still need to be R 
> CMD INSTALL'ed, but to a local directory, and then 
> library(lib.loc=<some-nice-local-path>) in my scripts, but this topic 
> has probably been previously covered ad nauseam. Time to go digging 
> through the docs and r-help archives.

I think the "local package" paradigm works fine for me. Otherwise, keep an R 
file you can source. You do not need to re-install packages for small 
changes to take effect while you develop a package; you can just 
evaluate the appropriate code snippet.

In addition I tend to use a Makefile if I find I re-install often.

Kasper