[R] Re: R-1.1.0 is released : GUI

Timothy H. Keitt keitt at nceas.ucsb.edu
Fri Jun 16 23:29:10 CEST 2000


cstrato at EUnet.at wrote:
> 
> 
> I have been using S-Plus and R for some time now; I find the language very
> elegant and enjoy programming in it.

Me too.

> 
> However, I have two problems, which other people have also mentioned in
> r-help a couple of times: I have very large datasets to work with and
> thus I need
> i, the ability to handle large data sets, and
> ii, speed, speed, speed.

I think what you want is a scalable solution, not speed per se. 
Scalability is more important than speed.  Who cares if your code runs
really fast on toy problems?  A scalable solution is one that performs
adequately on small problems, but does not become prohibitive when you
exceed small, fast resource pools, like cache, main memory, etc.  The
only way to achieve scalability is by design.  You have to create
powerful and flexible abstractions so that your algorithms do not depend
on details of the implementation.  In other words, the same code should
work whether your data is in cache, in main memory, in a database, or
distributed across a network.  Unfortunately, these
abstractions do not exist in the C code underlying R, nor in any(?) of
the external C/FORTRAN code in the R packages.
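
To make the idea concrete, here is a minimal C++ sketch of an algorithm
written against an abstraction rather than a storage layout.  Everything
in it (ChunkSource, MemorySource, FileSource, chunked_mean) is a
hypothetical name invented for illustration; none of it exists in R or
its packages.  The mean is computed one chunk at a time, so the same
function runs unchanged whether the numbers sit in memory or in a text
file on disk.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface: a source that hands out data one chunk at a time.
class ChunkSource {
public:
    virtual ~ChunkSource() = default;
    // Fills `buf` with up to `max_n` values; an empty `buf` means no data is left.
    virtual void next_chunk(std::vector<double>& buf, std::size_t max_n) = 0;
};

// Data already held in main memory.
class MemorySource : public ChunkSource {
public:
    explicit MemorySource(std::vector<double> data) : data_(std::move(data)) {}
    void next_chunk(std::vector<double>& buf, std::size_t max_n) override {
        buf.clear();
        while (pos_ < data_.size() && buf.size() < max_n) buf.push_back(data_[pos_++]);
    }
private:
    std::vector<double> data_;
    std::size_t pos_ = 0;
};

// Data stored as whitespace-separated numbers in a text file.
// A file that cannot be opened simply yields no data.
class FileSource : public ChunkSource {
public:
    explicit FileSource(const std::string& path) : in_(path) {}
    void next_chunk(std::vector<double>& buf, std::size_t max_n) override {
        buf.clear();
        double x;
        // Check the size first so a value is never read and then dropped.
        while (buf.size() < max_n && in_ >> x) buf.push_back(x);
    }
private:
    std::ifstream in_;
};

// The algorithm sees only ChunkSource, so it never holds more than
// `chunk_size` values at once, wherever the data actually lives.
// Returns 0.0 for an empty source.
double chunked_mean(ChunkSource& src, std::size_t chunk_size = 1024) {
    std::vector<double> buf;
    double sum = 0.0;
    std::size_t n = 0;
    for (src.next_chunk(buf, chunk_size); !buf.empty(); src.next_chunk(buf, chunk_size)) {
        for (double x : buf) sum += x;
        n += buf.size();
    }
    return n ? sum / static_cast<double>(n) : 0.0;
}
```

A database or network source would be one more subclass; chunked_mean
itself would not change.  Virtual dispatch costs one call per chunk, not
per element, so the abstraction adds little overhead.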

> My first question is:
> Since S/R is a full featured language it would be great to have a native
> compiler, so that I could write stand-alone programs which profit from the
> full speed, and ability to handle large data-sets, of a stand-alone application.
> Wouldn't this be an option to consider?

This will only help in the non-vectorized parts of your code.  If most
of what you do involves matrix and vector operations, which already run
in compiled C/FORTRAN code, compiling the S/R source will not give much
improvement.  Also, compiling code that is fundamentally not scalable to
large problems will not really give you what you want.

> Do I understand it right that the bindings to Tcl/Tk offer just this possibility?

That's my take on it, although I prefer gnome/glade.

> 
> I have the feeling the Java implementation will not solve my two main
> problems, speed and data-size.
>

This is more of a design issue than a language issue.  Having said that,
I also prefer C++ to Java.
 
> In my very personal opinion it would be great to have one of the
> following two options:
> a, a native R/S compiler

This will not give you what you want, for the reasons outlined above.

> b, an implementation of the R functionality as C++ classes.
 
I am strongly in favor of this solution.  Model the existing data
structures in the S/R language as abstract types (again my preference is
C++), then use these as the computational engine under the interpreted R
code.  There are some very good object oriented numerical packages
available that could be used in specific implementations.  Take a look
at:

	http://www.acl.lanl.gov/pooma/
	http://www.lsc.nd.edu/research/
	http://sourceware.cygnus.com/gsl/
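
As a rough sketch of what "abstract types" could mean here, the C++
below models an S/R numeric vector as an interface with two
interchangeable implementations.  The class and function names
(NumericVector, DenseVector, SeqVector, add) are hypothetical and are
not taken from R or from any of the packages listed above.  The add
function follows R's rule of recycling the shorter operand, but for
brevity omits the warning R gives when the lengths are not multiples of
each other.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical abstract type standing in for an S/R numeric vector.
class NumericVector {
public:
    virtual ~NumericVector() = default;
    virtual std::size_t size() const = 0;
    virtual double at(std::size_t i) const = 0;  // zero-based index
};

// Values stored contiguously in main memory.
class DenseVector : public NumericVector {
public:
    explicit DenseVector(std::vector<double> v) : v_(std::move(v)) {}
    std::size_t size() const override { return v_.size(); }
    double at(std::size_t i) const override { return v_[i]; }
private:
    std::vector<double> v_;
};

// The sequence from, from+1, ..., from+n-1, computed on demand.
// No elements are stored, so its memory use does not grow with n.
class SeqVector : public NumericVector {
public:
    SeqVector(double from, std::size_t n) : from_(from), n_(n) {}
    std::size_t size() const override { return n_; }
    double at(std::size_t i) const override { return from_ + static_cast<double>(i); }
private:
    double from_;
    std::size_t n_;
};

// Elementwise sum written only against the interface.
// The shorter operand is recycled; a zero-length operand gives a
// zero-length result.
std::unique_ptr<NumericVector> add(const NumericVector& a, const NumericVector& b) {
    std::vector<double> out;
    if (a.size() != 0 && b.size() != 0) {
        const std::size_t n = std::max(a.size(), b.size());
        out.reserve(n);
        for (std::size_t i = 0; i < n; ++i)
            out.push_back(a.at(i % a.size()) + b.at(i % b.size()));
    }
    return std::unique_ptr<NumericVector>(new DenseVector(std::move(out)));
}
```

The interpreter would hold only NumericVector handles, so a disk-backed
or distributed implementation could be added later without touching add
or any other code written against the interface.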

My 2 cents...

T.

-- 
Timothy H. Keitt
National Center for Ecological Analysis and Synthesis
735 State Street, Suite 300, Santa Barbara, CA 93101
Phone: 805-892-2519, FAX: 805-892-2510
http://www.nceas.ucsb.edu/~keitt/