[Rd] Suggested change to cor.test.default

Duncan Murdoch
Thu Jul 29 16:20:20 CEST 2021


The stats:::cor.test.default function has these tests near the start:

     if (length(x) != length(y))
         stop("'x' and 'y' must have the same length")
     if (!is.numeric(x))
         stop("'x' must be a numeric vector")
     if (!is.numeric(y))
         stop("'y' must be a numeric vector")

I'd like to suggest putting the first test in last place instead, which 
would make some user errors easier to diagnose.  For example, if I 
misspell one of the column names, I get

   df <- data.frame(x = 1:10, y = 1:10)
   cor.test(df$X, df$y)
   #> Error in cor.test.default(df$X, df$y): 'x' and 'y' must have the
   #> same length

because df$X is NULL.  It would be more obvious what went wrong if the 
error said

   Error in cor.test.default(df$X, df$y):  'x' must be a numeric vector
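
As a rough sketch of the reordering, the three checks can be put in a 
stand-alone function with the length test moved to last place.  The 
name check_xy is hypothetical and is used only for illustration; it is 
not part of stats and this is not a patch to cor.test.default itself.

```r
# Hypothetical helper (not part of stats): the same three checks as in
# cor.test.default, but with the length test last.
check_xy <- function(x, y) {
    if (!is.numeric(x))
        stop("'x' must be a numeric vector")
    if (!is.numeric(y))
        stop("'y' must be a numeric vector")
    if (length(x) != length(y))
        stop("'x' and 'y' must have the same length")
    invisible(TRUE)
}

df <- data.frame(x = 1:10, y = 1:10)

# df$X is NULL, so the numeric test on 'x' now fails first.
msg_typo <- tryCatch(check_xy(df$X, df$y),
                     error = function(e) conditionMessage(e))

# A genuine length mismatch between numeric vectors is still reported.
msg_len <- tryCatch(check_xy(1:3, 1:4),
                    error = function(e) conditionMessage(e))

print(msg_typo)
print(msg_len)
```

With this ordering the length message is only reached when both 
arguments are already known to be numeric.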

Duncan Murdoch

P.S. An even more friendly error message would give the actual 
expression for x instead, i.e.

   Error in cor.test.default(df$X, df$y):  'df$X' is not a numeric vector

but that's not the style of error used in most stats functions.
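
For illustration only, a message of that form could be built from the 
unevaluated argument with substitute().  The function name 
check_x_named below is hypothetical, and deparse1() assumes R >= 4.0.0; 
this is a sketch of the idea, not how any stats function is written.

```r
# Hypothetical helper (not part of stats): report the caller's
# expression for 'x' rather than the formal argument name.
check_x_named <- function(x) {
    # Capture the expression as written by the caller, e.g. df$X.
    x_expr <- deparse1(substitute(x))
    if (!is.numeric(x))
        stop(sprintf("'%s' is not a numeric vector", x_expr))
    invisible(TRUE)
}

df <- data.frame(x = 1:10, y = 1:10)
msg_expr <- tryCatch(check_x_named(df$X),
                     error = function(e) conditionMessage(e))
print(msg_expr)
```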


