[R] Computing the rank of a matrix.

Martin Maechler maechler at stat.math.ethz.ch
Sat Apr 7 16:57:27 CEST 2007


>>>>> "Ravi" == Ravi Varadhan <rvaradhan at jhmi.edu>
>>>>>     on Fri, 6 Apr 2007 12:44:33 -0400 writes:

    Ravi> Hi, qr(A)$rank will work, but just be wary of the
    Ravi> tolerance parameter (default is 1.e-07), since the
    Ravi> rank computation could be sensitive to the tolerance
    Ravi> chosen.

Yes, indeed.

The point is that   rank(<Matrix>) 
is well defined in pure math (linear algebra), as well as 
a "singular matrix" is.

The same typically no longer makes sense as soon as you enter
the real world: A matrix "close to singular" may have to be
treated "as if singular" depending on its "singularity
closeness" {{ learn about the condition number of a matrix }}
and the same issues arise with rank(<matrix>).

Of course, the matlab programmers know all this (and much more),
and indeed, matlab's  rank(A) really is  
    rank(A, tol = tol.default(A))

and is based on the SVD instead of QR decomposition since the
former is said to be more reliable (even though slightly slower).

R's equivalent (with quite a bit of fool-proofing) would be the
following function (assuming correct online documentation of matlab):


rankMat <- function(A, tol = NULL, singValA = svd(A, 0,0)$d)
{
    ## Purpose: rank of a matrix ``as Matlab''
    ## ----------------------------------------------------------------------
    ## Arguments: A: a numerical matrix, maybe non-square
    ##		tol: numerical tolerance (compared to singular values)
    ##	   singValA: vector of non-increasing singular values of A
    ##		     (pass as argument if already known)
    ## ----------------------------------------------------------------------
    ## Author: Martin Maechler, Date:  7 Apr 2007, 16:16
    d <- dim(A)
    stopifnot(length(d) == 2, length(singValA) == min(d),
              diff(singValA) < 0)       # must be sorted decreasingly
    if(is.null(tol))
        tol <- max(d) * .Machine$double.eps * abs(singValA[1])
    else stopifnot(is.numeric(tol), tol >= 0)
    sum(singValA >= tol)
}


A small scale simulation with random matrices,
i.e., things like

  ## ranks of random matrices; here will have 5 all the time:
  table(replicate(1000, rankMat(matrix(rnorm(5*12),5, 12) )))# < 1 sec.

indicates that qr(.)$rank  gives the same typically,
where I assume one should really use

      qr(., tol = .Machine$double.eps, LAPACK = TRUE)$rank

to be closer to Matlab's default tolerance.

Ok, who has time to investigate further? 
Research exercise:

 1>>  Is there a fixed number, say  t0 <- 1e-15 
 1>>    for which  qr(A, tol = t0, LAPACK=TRUE)$rank is 
 1>>   ``optimally close'' to rankMat(A) ?
 
   2>> how easily do you get cases showing svd(.) to more reliable
   2>> than qr(., LAPACK=TRUE)?
  
To solve this in an interesting way, you should probably
investigate classes of "almost rank-deficient" matrices,
and I'd also be interested if you "randomly" ever get matrices A
with  rank(A) <  min(dim(A)) - 1
(unless you construct some columns/rows exactly from earlier
 ones or use all-0 ones)

Martin Maechler, ETH Zurich




    Ravi> Ravi.

    Ravi> -----------------------------------------------------------------
    Ravi> Ravi Varadhan, Ph.D.
    Ravi> Assistant Professor, The Center on Aging and Health
    .......
    Ravi> http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html


    Ravi> --------------------------------------------------------------------



      >> How about

      >>       qr(A)$rank

      >> or perhaps

      >>       qr(A, LAPACK=TRUE)$rank

      >> Cheers,

      >> Andy

                                                                           
                                                                           




    >>     Hi! Maybe this is a silly question, but I need the
    >> column rank (http://en.wikipedia.org/wiki/Rank_matrix)
    >> of a matrix and R function 'rank()' only gives me the
    >> ordering of the elements of my matrix.  How can I
    >> compute the column rank of a matrix? Is there not an R
    >> equivalent to Matlab's 'rank()'?  I've been browsing
    >> for a time now and I can't find anything, so any help
    >> will be greatly appreciated. Best regards!


    >> -- -- Jose Luis Aznarte M.
    >> http://decsai.ugr.es/~jlaznarte Department of Computer
    >> Science and Artificial Intelligence Universidad de
    >> Granada Tel. +34 - 958 - 24 04 67 GRANADA (Spain) Fax:
    >> +34 - 958 - 24 00 79



More information about the R-help mailing list