[R] possible bug in function 'var' in R 2.7.2?

Martin Maechler maechler at stat.math.ethz.ch
Fri Oct 3 09:23:38 CEST 2008


>>>>> "BB" == Ben Bolker <bolker at ufl.edu>
>>>>>     on Thu, 2 Oct 2008 13:07:02 +0000 (UTC) writes:

    BB> <klaus.steenberg.larsen <at> risoe.dk> writes:
    >> 
    >> Dear R-Help,
    >> 
    >> I have used R2.6.0 until I recently installed also R2.7.2 (see details below)
    >> 
    >> In R 2.6.0, the following script using the function 'var' (cor(stats)):
    >> 
    >> x.test <- c(NA, NA, NA, NA)
    >> 
    >> var(x.test, na.rm=T)        
    >> 
    >> gives the output: 
    >> 
    >> NA
    >> 
    >> In R2.7.2 the output of the same script generates an error message and stops R:
    >> 
    >> 'Error in var(x.test, na.rm = T) : no complete element pairs'
    >> 
    >> R2.7.2 can handle it if there is just one non-NA value in the list but not if
    >> they are all NA.
    >> 
    >> I prefer the output of 2.6.0. Is this a bug in 2.7.2 or is it a deliberate
    >> change compared to previous versions?
    >> Or is there a way to make R2.7.2 give NA as output?
    >> 
    >> Thank you for any help/comments!
    >> 
    >> Best regards,
    >> 
    >> Klaus

    BB> This is a deliberate change, but the behavior will
    BB> (more or less) revert in version 2.8.0.  

    BB> From the NEWS file for 2.7 (in bug fixes):

    BB> o	co[rv](use = "complete.obs") now always gives an error if there
    BB> are no complete cases: they used to give NA if
    BB> method = "pearson" but an error for the other two methods.
    BB> (Note that this is pretty arbitrary, but zero-length vectors
    BB> always give an error so it is at least consistent.)

    BB> Since sd(na.rm=TRUE) and var(na.rm=TRUE) both call cov(use =
    BB> "complete.obs"), this applies also to them.

    BB> cor(use="pair") used to give diagonal 1 even if the variable
    BB> was completely missing for the rank methods but NA for the
    BB> Pearson method: it now gives NA in all cases.

    BB> cor(use="pair") for the rank methods gave a matrix result with
    BB> dimensions > 0 even if one of the inputs had 0 columns.


    BB> From the NEWS file for the development version

    BB> o   var(),cov(),cor() etc now by default (when 'use' is not specified)
    BB> return NA in many cases where they signalled an error before.

    BB> I don't know of a really easy way to make the behavior revert; perhaps
    BB> the easiest workaround is to make a 'my.var' function that first
    BB> tests if(all(is.na(x))) -- if you want to live really dangerously
    BB> you could even call it 'var' and have it mask the built-in function,
    BB> but that's probably a bad idea.
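
A minimal sketch of the 'my.var' workaround described above, for anyone
who has to stay on R 2.7.2. Only the name 'my.var' and the
all(is.na(x)) test come from the message; the rest of the body is an
assumption. It handles plain vectors only and always drops NAs, so it
is not a general replacement for var().

```r
# Hypothetical helper: the name follows the suggestion above,
# the implementation is an illustrative assumption.
my.var <- function(x) {
  # If every element is missing (or x is empty), return NA
  # instead of letting var() signal an error.
  if (all(is.na(x))) {
    return(NA_real_)
  }
  # Otherwise defer to the built-in, dropping missing values.
  stats::var(x, na.rm = TRUE)
}

x.test <- c(NA, NA, NA, NA)
my.var(x.test)           # NA, no error
my.var(c(1, 2, 3, NA))   # 1, same as var(c(1, 2, 3))
```

Because the helper has its own name, the built-in var() stays
unmasked and other code that relies on it is unaffected.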

Thank you, Ben.

Yes, masking var() is a bad idea;
a much better (and much less error-prone) idea would be to install
R 2.8.0 alpha right now.
It will become 'beta' early next week.

We are asking the R community to please install and use
pre-release versions of R  (if you can / are allowed to)
at least from beta onwards, and report problems you see early on 
*before* the final release.
That's one thing you can give back to the R developers who
provide R freely to you.

Best regards,
Martin Maechler
ETH Zurich and R Core Team


