[Rd] Wish list

Robert Gentleman rgentlem at fhcrc.org
Mon Jan 1 19:37:09 CET 2007



Gabor Grothendieck wrote:
> On 1/1/07, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>> A few comments thrown in, and some general comments at the bottom.
>>
>> On 1/1/2007 1:28 AM, Gabor Grothendieck wrote:
>>> This is my 2007 New Year wishlist for R features:
>>>
>>> 1. Matrix Multiplication
>>>    Enhance matrix multiplication to work with multidimensional
>>>    arrays such that the last dimension of the first multiplicand
>>>    must equal the first dimension of the second. See:
>>>    https://www.stat.math.ethz.ch/pipermail/r-devel/2006-July/038497.html
>>>
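As a rough illustration of item 1, the contraction described there can be sketched in current R by flattening both arrays to matrices, multiplying, and restoring the outer dimensions. The name array_mult is hypothetical and not part of base R; the sketch assumes both arguments carry a dim attribute and that at least one outer dimension remains after the contraction.

```r
# Hypothetical helper (not in base R): contracts the last dimension of `a`
# with the first dimension of `b`, generalising %*% to arrays.
array_mult <- function(a, b) {
  da <- dim(a)
  db <- dim(b)
  stopifnot(!is.null(da), !is.null(db), length(da) + length(db) > 2)
  k <- db[1]
  stopifnot(da[length(da)] == k)   # last dim of a must equal first dim of b
  lead  <- da[-length(da)]         # dimensions of a that survive
  trail <- db[-1]                  # dimensions of b that survive
  # R stores arrays in column-major order, so these reshapes keep the
  # contracted index as the columns of a2 and the rows of b2.
  a2 <- matrix(as.vector(a), nrow = prod(lead), ncol = k)
  b2 <- matrix(as.vector(b), nrow = k, ncol = prod(trail))
  array(a2 %*% b2, dim = c(lead, trail))
}

a <- array(1:24, dim = c(2, 3, 4))
b <- array(1:8,  dim = c(4, 2))
r <- array_mult(a, b)
dim(r)   # 2 3 2
```

For two ordinary matrices this reduces to the usual %*% result.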
>>> 2. Grid
>>>    - logical-valued function as first arg of grid.edit
>>>    - transparency under Windows (not sure if this involves grid
>>>      or just the Windows graphics device)
>>>    - shading patterns
>>>    - more interactivity features
>>>    - safe way to get name of a grid object, e.g.
>>>         names.vpPath <- names.viewport <- function(x) x$name
>>>    - safe way to get children of a grid object
>>>         getChildren.viewport <- function(x) x$children
>>>      and the order; see:
>>>      https://www.stat.math.ethz.ch/pipermail/r-devel/2005-June/033532.html
>>>    - facility for using a name, viewport or vpPath interchangeably
>>>      so that, for example, any of them can be specified in
>>>      print.trellis(..., draw.in=...) or draw.key(..., vp=...)
>>>
>>> 3. Lattice.
>>>    - make panel functions generic
>>>    - allow print.trellis args to be specified in xyplot, etc.
>>>    - shading patterns (once grid implements them)
>>>    - safe way to access lattice:::getStatus and lattice:::updateList
>>>    - allow name, viewport or vpPath to be specified in draw.in=
>>>      arg of print.trellis (and vp= arg of draw.key?)
>>>    - document parameters, i.e. those output from trellis.par.get()
>>>    - support for groups in histogram
>>>
>>> 4. Higher level Windows clipboard functions.
>>>    Since R 2.3.0 R can handle non-text objects
>>> on the Windows clipboard.  We now need some higher
>>> level functionality that makes use of that
>>> to read in non-text from the clipboard.  For
>>> example, one can select a table on an HTML
>>> page in Internet Explorer and invoke copy
>>> and it will copy it to the clipboard in a
>>> non-text format.  If one invokes paste in
>>> Excel, Excel will automatically detect the
>>> non-text format and copy it in the expected
>>> way so that it appears in Excel one table
>>> cell per Excel cell.
>>>
>>> However, R does not currently
>>> support this level of integration. (Current
>>> workaround is to paste it into Excel and then copy
>>> it back out of Excel.  Excel will insert tabs between
>>> text that is so copied.)
>> R doesn't have HTML parsing built in, so this would be a fairly major
>> addition.  It's a much better idea to write a package to do this.  If
>> the R clipboard support is missing something that such a package would
>> need, that would be a reasonable addition to R.
>>
>>> 6. Allow attributes to be associated with an environment
>>> variable without having them associated with the environment
>>> itself.  This would allow more powerful inheritance in
>>> the case of subclasses of environment.
>>> See:
>>>   https://stat.ethz.ch/pipermail/r-devel/2006-July/038377.html
>>> and subsequent postings in that thread.  Any package
>>> that uses the list(env = whatever) idiom to define
>>> objects could make use of this.
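For readers unfamiliar with the background to item 6, the following is a small sketch of the behaviour at issue: environments are not copied on assignment, so an attribute set through one variable is visible through every other variable that refers to the same environment. The list(env = ...) idiom avoids this by putting attributes on a wrapper list. The class name wrapped_env is made up for the illustration.

```r
e <- new.env()
alias <- e                              # same environment, not a copy
attr(alias, "note") <- "set via alias"
attr(e, "note")                         # "set via alias"

# The list(env = whatever) idiom: attributes live on the wrapper list,
# so two wrappers can share one environment yet differ in attributes.
obj1 <- structure(list(env = e), class = "wrapped_env")  # hypothetical class
obj2 <- structure(list(env = e), class = "wrapped_env")
attr(obj1, "note") <- "only on obj1"
is.null(attr(obj2, "note"))             # TRUE
identical(obj1$env, obj2$env)           # TRUE
```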
>> As I said in that thread, this is not a good suggestion.
> 
> Yes, but I disagree with that assessment and I am not the
> only one.

   Nor is Duncan alone in his.

   best wishes
     Robert

> 
>>> 7. documentation standards for packages
>>>    - NEWS/ChangeLog (also should be accessible from CRAN page for package
>>>      and should be included in built version of package)
>>>    - package?mypackage
>> I don't understand the second part of this.  We already support a
>> package?mypackage topic, and recommend that people put it in.  Are you
>> saying packages should be rejected if they don't?  That's an awful lot
>> of work you're asking other people to do.
> 
> There should be some guidelines as to what goes into mypackage-package.Rd .
> 
>>> 8. Need to be able to distinguish between ordinary missing values
>>> and structurally missing ones.
>> I think this is something that you need to do in a different way.  There
>> are tons of possible semantics for what NA should mean.  I don't think
>> this should be made more complicated for everyone.
>>
> 
> Although one does not want to overcomplicate things, the fact is that
> there are two issues here, structural and non-structural, and trying to
> force them into a single construct is not simplifying; rather, it
> fails to model what is required adequately.
> 
> 
>>> 9. bidirectional pipes in Windows
>>>
>>> 10. Create a log updated at a regular frequency (daily or real time)
>>> that tracks all changes on CRAN, e.g.
>>>
>>>       Date(GMT)           Package Version Action
>>>       2006-09-20 21:22:01 mypkg   1.0.1   new
>>>       2006-09-20 22:00:23 mypkg2  0.2.1   updated
>>>
>>> 11. make integrate generic.  Ryacas could use that.
>>>
>>> 12. Remove all R CMD dependencies on the find.exe command.  find is a built
>>>     in command in Windows and having find.exe on my path causes
>>>     problems with other programs.
>> A simpler fix for this would be for you to define a wrapper for R CMD
>> that installed the R tools path before executing, and uninstalls it
>> afterwards.  But this is unnecessary for most people, because
>> Microsoft's find.exe is pretty rarely used.
>>
> 
> Anyone who uses batch files will use it quite a bit.  It certainly causes
> me problems on an ongoing basis and is an unacceptable conflict in
> my opinion.
> 
> I realize that it's not entirely R's doing, but it would be best if R did not
> make it worse by requiring the use of find.
> 
>>> 13. Make upper/lower case of simplify/SIMPLIFY consistent on all
>>>     apply commands and add a simplify= arg to by.
>> It would have been good not to introduce the inconsistency years ago,
>> but it's too late to change now.
> 
> It's not too late to add it to by().
> 
> Also note that the gsubfn package does have a workaround for this.  In gsubfn
> one can preface any R function with fn$ and if that is done then the function
> can have a simplify= argument which fn$ intercepts and processes.  e.g.
> 
> library(gsubfn)
> fn$by(CO2[4:5], CO2[2], x ~ coef(lm(uptake ~ ., x)), simplify = rbind)
> 
> fn$ can also interpret formulas as functions (and does quasi-Perl interpolation
> in strings), so the formula in the third argument is regarded as the same
> as the anonymous function:  function(x) coef(lm(uptake ~ ., x)) .
> 
> More examples are in the gsubfn vignette.
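Without gsubfn, a comparable result can probably be obtained in base R by binding the per-group pieces that by() returns. This is only a sketch of the idea, not a substitute for a simplify= argument on by() itself; the variable names pieces and res are arbitrary.

```r
# by() returns one coefficient vector per level of Type (column 2 of CO2)
pieces <- by(CO2[4:5], CO2[2], function(x) coef(lm(uptake ~ ., x)))

# Bind the per-group vectors into a matrix with one row per group
res <- do.call(rbind, pieces)
dim(res)   # 2 2
```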
> 
>>> 14. better reporting of location of errors and warnings in R CMD check.
>> This is in the works, but probably not for 2.5.x.
> 
> Great.  This will be very welcome.
> 
>>> 15. tcl tile library (needs tcl 8.5 or to be compiled in with 8.4)
>>>
>>> 16. extend aggregate to allow vector valued functions:
>>>     aggregate(CO2[4:5], CO2[1:2], function(x) c(mean = mean(x), sd = sd(x)))
>>>     [summaryBy in doBy package and cast in reshape package can already
>>>     do similar things but this seems sufficiently fundamental that it
>>>     ought to be in the base of R]
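To make item 16 concrete, here is a minimal base R sketch of the kind of result being asked for, built from split() and lapply(). The helper name aggregate_multi is hypothetical, and the sketch handles a single numeric vector rather than a whole data frame as aggregate does.

```r
# Hypothetical helper (not a base R function): applies a vector-valued
# summary function to each group and stacks the results row by row.
aggregate_multi <- function(x, by, f) {
  pieces <- split(x, by, drop = TRUE)   # one piece per observed group
  do.call(rbind, lapply(pieces, f))     # one row per group
}

res <- aggregate_multi(CO2$uptake, CO2[1:2],
                       function(x) c(mean = mean(x), sd = sd(x)))
colnames(res)   # "mean" "sd"
```

Each row of res corresponds to one observed Plant/Type combination.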
>>>
>>> 17. All OSes should support input= arg of system.
>>>
>>> My previous New Year wishlists are here:
>>>
>>> https://www.stat.math.ethz.ch/pipermail/r-devel/2006-January/035949.html
>>> https://www.stat.math.ethz.ch/pipermail/r-help/2005-January/061984.html
>>> https://www.stat.math.ethz.ch/pipermail/r-devel/2004-January/028465.html
>> To anyone still reading:
>>
>> Many of the suggestions above would improve R, but they're unlikely to
>> happen unless someone volunteers to do them.  I'd suggest picking
>> whichever item, from this or some other list, you think is the highest
>> priority, and posting a specific proposal to this list about how to do it.
>>  If you get a negative response or no response, move on to the next
>> one, or put it into a contributed package instead.
>>
> 
> I think it works best when contributors develop their software in
> contributed packages since it avoids squabbles with the core group.
> 
> The core group can then integrate these into R itself if it seems warranted.
> 
>> When you make the proposal, consider how much work you're asking other
>> people to do, and how much you're volunteering to do yourself.  If
>> you're asking others to do a lot, then the suggestion had better be
>> really valuable to *them*.
>>
> 
> The implementation effort should not be a significant consideration in
> generating wish lists.    What should be considered is what is really needed.
> It's better to know what you need and then later decide whether to implement
> it or not than to suppress articulating the need.  Otherwise the development
> is driven by what is easy to do rather than what is needed.
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


