[R] open source and R

Liaw, Andy andy_liaw at merck.com
Mon Nov 14 20:38:48 CET 2005


Here comes a not-so-nice one:  Sorry to be blunt, but I think the current
reality is that one's effectiveness in scientific computing is not likely to
be high if s/he can't read C or Fortran code.

The mode of development for new methods, I believe, should be:

- Write it in R (or S-PLUS or Matlab or ...) because one can usually do that
quite quickly.

- Check and make sure the code produces correct results.

- See if the code can be improved for efficiency.  Use the profiling
facility in R to see where the bottlenecks really are, and try to improve
those parts.

- If no significant improvement is possible in R, move only the
time-consuming part of the computation to C/Fortran/C++.
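
As a rough illustration of the first three steps above, here is a minimal
sketch.  The function col_means_loop() and the test matrix are made up for
this example, and it assumes an R build with profiling support enabled
(Rprof() is not available otherwise).

```r
# Step 1: a quick, deliberately naive pure-R implementation
# (col_means_loop is a hypothetical name used only for this sketch).
col_means_loop <- function(m) {
  out <- numeric(ncol(m))
  for (j in seq_len(ncol(m))) {
    total <- 0
    for (i in seq_len(nrow(m))) total <- total + m[i, j]
    out[j] <- total / nrow(m)
  }
  out
}

# Step 2: check the result against a trusted reference, here colMeans().
set.seed(1)
m <- matrix(rnorm(1000 * 200), nrow = 1000)
stopifnot(isTRUE(all.equal(col_means_loop(m), colMeans(m))))

# Step 3: profile to see where the time really goes.
prof_file <- tempfile()
Rprof(prof_file)                      # start the sampling profiler
for (k in 1:20) col_means_loop(m)     # repeat so that samples get recorded
Rprof(NULL)                           # stop profiling

# summaryRprof() may signal an error if no samples were recorded,
# so the call is guarded here.
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self))
```

In a case like this the profile would typically point at the inner loop, and
the first remedy is a vectorized R call such as colMeans(); only if nothing
like that exists would moving the hot loop to C or Fortran be worth it.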

The above mode is not always followed, because many of the packages on CRAN
are simply R interfaces to _existing_ C/Fortran code.  One would be happy to
be able to use them at the R level, but to rewrite the whole thing in R, one
had better have a _very_ good reason!

For some algorithms, efficient code can be written in pure R, but the
resulting code can be less readable than the same algorithm written more
legibly in C or Fortran.
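
A small sketch of that trade-off, using a made-up task (lengths of runs of
equal values) and two hypothetical function names.  The loop version reads
much like what one would write in C; the vectorized version avoids the
R-level loop but takes a moment to decode.

```r
# Loop version: explicit and easy to follow, much as it would look in C.
run_lengths_loop <- function(x) {
  if (length(x) == 0) return(integer(0))
  lens <- integer(0)
  current <- 1L
  for (i in seq_along(x)[-1]) {
    if (x[i] == x[i - 1]) {
      current <- current + 1L       # still inside the same run
    } else {
      lens <- c(lens, current)      # run ended: record its length
      current <- 1L
    }
  }
  c(lens, current)
}

# Vectorized version: find the positions where the value changes, then
# take differences between consecutive change points.
run_lengths_vec <- function(x) {
  n <- length(x)
  if (n == 0) return(integer(0))
  diff(c(0L, which(x[-1] != x[-n]), n))
}

x <- c(1, 1, 2, 2, 2, 3, 1, 1)
stopifnot(identical(run_lengths_loop(x), run_lengths_vec(x)))
stopifnot(identical(run_lengths_vec(x), rle(x)$lengths))
```

Base R already provides this as rle(); the point is only that the one-line
form is harder to read than the loop it replaces.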

Just my $0.02...

Andy

> From: Robert
> 
> Thanks for all the nice discussions. 
>   Though different users have various needs from R, it's 
> always good to stand on the shoulders of giants (as Roger 
> said). How far we will see depends on our ability to understand 
> what has been done in other languages. 
>   The package written in pure R might be good for education 
> in starting OOP in research but not effective in scientific 
> computing as suggested.
>   
> 
> Ted.Harding at nessie.mcc.ac.uk wrote:
>   On 13-Nov-05 Roger Bivand wrote:
> > On Sun, 13 Nov 2005, Robert wrote:
> > 
> >> If I do not know C or FORTRAN, how can I fully understand the
> >> package or possibly improve it?
> > 
> > By learning enough to see whether that makes a difference for your 
> > purposes. Life is hard, but that's what makes life interesting ...
> > 
> >> Robert.
> >> 
> >> Roger Bivand wrote:
> >> On Sun, 13 Nov 2005, Robert wrote:
> >> 
> >> > Roger Bivand wrote: 
> >> > On Sun, 13 Nov 2005, Robert wrote:
> >> > 
> >> > > It uses FORTRAN code and not in pure R.
> >> > 
> >> > The same applies to deldir - it also includes Fortran. So the
> >> > answer seems to be no, there is no voronoi function only
> >> > written in R.
> >> > 
> >> 
> >> Robert wrote:
> >> 
> >> > 
> >> > I am curious about one thing: the reason for using R is
> >> > that it is an easy-to-learn language, which is good for
> >> > getting more people involved.
> >> >
> >> > Why, then, do most of the packages written in R use code
> >> > in other languages such as FORTRAN? I understand that some
> >> > functions have already been written in another language, or
> >> > are faster when implemented in another language.
> >> >
> >> > But my understanding is if the user does not know that
> >> > language (for example, FORTRAN), the package is still a
> >> > black box to him because he can not improve the package and
> >> > can not be involved in the development. 
> >> >
> >> > When I searched the packages of R, I saw many packages with
> >> > duplicated or similar functions. The main difference among
> >> > them is the functions implemented in other languages, which
> >> > are always a black box to the users. So it is very hard for
> >> > users to trust that a package will do what they need, let
> >> > alone get involved in its development. My comments are not
> >> > meant to disregard these efforts. But it is good to see
> >> > packages written in pure R.
> >> > 
> >> 
> >> Although surprisingly much of R is written in R, quite a lot is
> >> written in Fortran and C. One very good reason, apart from
> >> efficiency, is code re-use - BLAS and LAPACK among many others
> >> are excellent implementations
> >> of what we need for numerical linear algebra. R is very typical
> >> of good scientific software, it tries to avoid re-implementing
> >> functions that are used by the community, are well-supported by
> >> the community, and work. Packages by and large do the same - if
> >> existing software does the required job, package authors attempt
> >> to port that software to R, providing interfaces to underlying
> >> C or Fortran libraries. 
> >> 
> >> It's about standing on the shoulders of giants.
> 
> Those are very strong points. Some comments:
> 
> It would be possible to implement in "pure R" a matrix inversion
> or eigenvalue/vector function, for instance, and I'm sure it would
> be done (if it were done) to very high quality. However, it would
> run like an elephant in quicksands. BLAS and LAPACK have, over the
> years, become highly optimised not just for accuracy and robustness,
> but for speed and efficiency.
> 
> Also, you will hit the "other language" problem sooner or
> later. Robert's complaint is that he does not like black
> boxes. But R itself is a black box. You cannot write R in R,
> all the way down to the bottom. At the bottom is machine
> code, and languages like assembler, C, C++, FORTRAN and
> their compilers provide "black box" wrappers for this.
> 
> That is not a whimsical comment either -- all those discussions
> about why 2 - sqrt(2)^2 is not equal to 0 come down to this
> sort of issue. Sooner or later, if you really want to understand
> what is going on, you have to get beneath the shiny smooth
> surface and swim amongst the molecules!
> 
> So, Robert, try to be positive about C and FORTRAN etc., rather
> than feeling put off by the fact that they are yet more things
> to learn and seem to get in the way of understanding how the
> functions work. C and FORTRAN are your friends, as well as
> the R language itself, and a great deal more friendly than
> the raw machine code. 
> 
> There is one aspect though where R users are in the cold when
> it comes to C and FORTRAN. If you want to understand the function
> 'eigen', say, then you can "?eigen" to learn about its usage.
> You can enter "eigen" to see the R code, and indeed that is
> not too incomprehensible. But then you find
> 
> .Fortran("ch", n, n, xr, xi, values = dbl.n, 
> !only.values, vectors = xr, ivectors = xi, dbl.n, 
> dbl.n, double(2 * n), ierr = integer(1),
> PACKAGE = "base")
> 
> and similar for "rs", "cg" and "rg". Where's the help for
> these? Nowhere obvious! In fact you have to go to the source
> code, locate the FORTRAN routines, and study these, hoping
> that enough helpful comments have been included to steer
> your study. So it is a much more formidable task, especially
> if you are having to learn the language at the same time.
> 
> Best wishes,
> Ted.
> 
> 
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) 
> Fax-to-email: +44 (0)870 094 0861
> Date: 13-Nov-05 Time: 23:13:58
> ------------------------------ XFMail ------------------------------
>   
> 
> 
> 		
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
>



