[Rd] R + C + Lapack toy regression example

Douglas Bates bates at stat.wisc.edu
Fri Sep 25 22:57:03 CEST 2009


On Thu, Sep 24, 2009 at 2:09 PM, Vinh Nguyen <vqnguyen at uci.edu> wrote:
> On Thu, Sep 24, 2009 at 11:49 AM, Simon Urbanek
> <simon.urbanek at r-project.org> wrote:
>> As Doug pointed out you don't want to be using .C(). As for matrix
>> manipulations - they are usually done directly on the objects which are
>> vectors stored in column-major order.
>
> i meant .Call().   also, sorry for the poor word choice, i meant
> matrix computations, not matrix manipulations.  i guess i just want
> some recommendations for future references on what C library to use if
> i were doing something like re-implement linear regression
> (hypothetically speaking).  i assumed it would be lapack.  is this the
> recommended approach?

Yes, Lapack would be the recommended code for numerical linear
algebra.  If happens to be Fortran code but it is callable from C/C++.
 One big advantage of Lapack (besides its having been written by some
of the top people in the field and being very well tested) is that it
uses the BLAS (Basic Linear Algebra Subroutines) as levels 1, 2 and 3.

However, implementing linear models software is more subtle than just
the numerical part.  In fact, least squares calculations are (I think)
the only part of R where Linpack is used instead of  Lapack, by
default.  Most of the time the symbolic analysis of a linear model
formula done in R produces a model matrix with full column rank but
under some circumstances it doesn't.  (The easiest such case to
explain is a two-way layout with missing cells.)  The way these cases
are handled is to use a special pivoting algorithm in the
decomposition.  Neither Linpack nor Lapack provided a pivoting scheme
so early in the R project Ross modified the Linpack version of the QR
decomposition to use such a scheme.  It is still the one in use.



More information about the R-devel mailing list