[Rd] Wish list

M. Edward (Ed) Borasky znmeb at cesmail.net
Sun Jan 1 20:08:42 CET 2006


Duncan Temple Lang wrote:

>And while we are on the topic of wishlists...
>Generally (i.e. not directed specifically to Gabor),
>the suggestions are very welcome, but so are contributions.
>And for issues such as making the existing R available on handhelds,
>that is a programming task.
>
Hasn't someone ported R to the Sharp Zaurus, for which both the Linux 
kernel and a more or less complete GNU toolchain exist, plus at least 
two GUI builders? I've forgotten what the compiler version is -- it 
might be back around 2.95.

In any event, one of the Lisps and Maxima have been ported to the 
Zaurus. I'm not sure how well a number crunching application like R 
would run on the Zaurus processor, though -- IIRC the floating point is 
emulated in software. Isn't the same true for Palms and Windows CE PDAs?

>And I draw a large distinction between
>programming and creative research which is based on new concepts and
>paradigms.  The pool of people working in statistical computing research
>is very small. And to a large extent, their time is consumed with
>programming - making the same thing work on multiple platforms,
>correcting documentation, etc. which are good things, but
>not obviously the best use of available research ability and time.
>There are many more topics that are in progress that represent
>changes to what we can do  rather than just to how we do the same thing.
>  
>
I'd much rather have changes to what we can do than changes to how we do
the same thing! As the Perl folks say, "There's more than one way to do it!"
So keep R and its contributed packages focused on providing the first few
ways to do something new!

>One of the reasons S (R and S-Plus) is where it is now
>is because in Bell Labs, the idea was to be thinking
>5 years ahead and both meeting and directing the needs for the future.
>Because of R's popularity (somewhat related to it being free), there is
>an aspect of development that focuses more on software for statisticians
>to use "right now".
>Obviously, the development is a mixture of both the current and the
>future, but there is less of the future and certainly less of the
>longer term directions that is sacrificed by the need to maintain an
>existing system and be backward-compatible.
>If statistics is to fulfill its potential in this modern IT, we need new
>ideas and research into those new ideas. If we focus on basic
>programming tasks (however complex) and demand usability above concepts,
>we risk losing those whose primary focus is in statistical computing
>research from the field.
>  
>
Amen! Please don't turn R into Perl! The Perl community has statistical 
libraries for the basics. If that's all you want to do, just learn how 
to do it in Perl. The same goes for Python and Ruby. All the scripting 
languages can be used for basic statistical and numeric processing, and 
their communities are adding libraries for more advanced functionality 
all the time.

But no other language/community has the breadth of advanced statistical 
processing that R and its contributed packages have, and no other 
language has the right core semantics to make this kind of computing 
easy, with the possible exception of the newest dialects of Fortran. I 
*could* write a web ecommerce site in R if I wanted to, but why would I? 
I'd do that in PHP or the new Ruby on Rails, because that's what those 
languages were designed to do well!

>While R provides statisticians and stat. comp. researchers with a
>terrific vehicle for doing their respective work, it also acts as
>a constraint for doing anything even moderately new. But much (not all)
>of R is based on innovations from the 1970's, 80's and 90's.   And
>as IT evolves at a terrific pace, to keep up with it, we need to be
>forward looking.
>  
>
Could you elaborate on the nature of the constraints R imposes? 
Obviously there are *time* constraints made necessary by the programming 
tasks and finite number of community members, but are there limits to 
the kinds of scientific/statistical computing thoughts one can think if 
one only uses R and its contributed packages?

>I'll leave it there - for the moment - and go fight off the ants
>that are invading my desk!  While I wrote this down relatively
>rapidly, the ideas have been brewing for a long time. If anyone
>wishes to comment on the theme, I hope they will take a few minutes
>to think about the broad set of issues and tradeoffs.
>  
>
I've been thinking about related issues over the holiday break, mostly 
triggered by Paul Graham's essay on a programming language that would 
last 100 years. The essay will appear on my blog in the near future. 
Meanwhile, I'll add my wish list (and list of things I'd work on in my 
spare time if I had any :) ) for R.

1. An integrated symbolic math capability. I think packaging GiNaC 
(http://freshmeat.net/projects/ginac/) is the logical way to do this. 
GiNaC is a C++ library, and I suspect it could be easily packaged, but I 
haven't tried it yet. If someone is ahead of me on this, I'd like to 
know about it before I attempt it.

2. A good solid discrete-time and continuous-time Markov chain analyzer 
for use in computer performance analysis. There are quite a few good 
toolsets out there, some with GUIs and some without, but nearly all of 
them have licenses that are not free as in speech. They're freely 
obtainable in the academic community, but not for "commercial use". 
There is one exception, and if I followed the path of integrating an 
existing package, I'd go with Prism (http://www.cs.bham.ac.uk/~dxp/prism/).

3. Along the lines of 2, more "out-of-core" solver capabilities. I don't 
think it's going to be much longer before a "typical scientific 
researcher" in a domain like bioinformatics or computer performance 
analysis will have available a workstation with two physical 64-bit
processors, 4GB of memory and a terabyte of local disk, plus, of course, access to a
grid for the "big problems." :) At the moment, I don't have any computer 
performance analysis problems with enough states to require an efficient 
out-of-core solver, but it's bound to happen.
-- 

M. Edward (Ed) Borasky

http://borasky-research.blogspot.com


