[Rd] ave returns wrong data type (PR#13664)

brenbarn at brenbarn.net brenbarn at brenbarn.net
Sun Apr 19 23:05:11 CEST 2009


Full_Name: Brendan Barnwell
Version: 2.9.0
OS: Windows XP Pro
Submission from: (NULL) (71.102.131.29)


   The ave() function returns an incorrect data type.  Specifically,
ave(x, g, FUN=f) always returns a vector with the same mode as x, rather than
using the mode of the vector returned by f.  Observe:

> x
 [1] "A" "B" "C" "A" "B" "C" "A" "B" "C" "A" "B" "C" "A" "B" "C" "A" "B" "C" "A"
[20] "B" "C" "A" "B" "C" "A" "B" "C" "A" "B" "C"
> g
 [1] "X" "Y" "X" "Y" "X" "Y" "X" "Y" "X" "Y" "X" "Y" "X" "Y" "X" "Y" "X" "Y" "X"
[20] "Y" "X" "Y" "X" "Y" "X" "Y" "X" "Y" "X" "Y"
> ave(x, g, FUN=length)
 [1] "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15"
[16] "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15" "15"
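
   For anyone who wants to reproduce this, here is a minimal sketch.  The
rep() calls are my assumption about how x and g could be built; only their
printed values are given above.

```r
# Rebuild the inputs shown in the report (construction is assumed;
# only the printed values appear in the transcript).
x <- rep(c("A", "B", "C"), times = 10)   # 30 character values
g <- rep(c("X", "Y"), times = 15)        # 30 group labels, 15 per group

res <- ave(x, g, FUN = length)

typeof(res)          # "character" is the behaviour reported for R 2.9.0
unique(res)          # the distinct values ave() produced
typeof(length(x))    # the type FUN itself returns
```

   Comparing typeof(res) with typeof(length(x)) shows whether the integer
result of FUN survived the trip through ave().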

   Even though the length() function returns an integer, ave()
inappropriately converts the result to a character vector.  The bug is due to
this line in the definition of ave():

split(x, g) <- lapply(split(x, g), FUN)

   By assigning the result of lapply() back into the original argument x, it
coerces that result to the type of that argument.  This contradicts the
documentation, which says that the value of ave() is "a numeric vector".  I
would suggest that the documentation itself doesn't describe the desired
behavior either.  The result vector should be of the type returned by FUN (just
as it is for tapply).  Otherwise it is impossible to use ave() to compute
summary statistics whose type differs from that of the argument.
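
   To illustrate the coercion and one possible way around it, here is a rough
sketch.  ave_typed() is a hypothetical helper of my own, not part of R and not
a proposed patch; it assumes a single grouping vector and builds the result
from what FUN returns instead of writing back into x.

```r
x <- rep(c("A", "B", "C"), times = 10)   # assumed construction of x
g <- rep(c("X", "Y"), times = 15)        # assumed construction of g

# Assigning integers into a character vector coerces them to character.
# The replacement call split(x, g) <- value makes the same kind of assignment.
y <- x
y[g == "X"] <- 15L
typeof(y)

# tapply() keeps the type returned by FUN.
typeof(tapply(x, g, length))

# Hypothetical helper (not part of R): like ave() for one grouping vector,
# but the result takes its type from FUN's return values.
ave_typed <- function(x, g, FUN) {
  f <- as.factor(g)
  groups <- split(x, f)
  pieces <- lapply(groups, FUN)
  # Recycle each group's result to that group's size.
  pieces <- Map(function(p, grp) rep(p, length.out = length(grp)),
                pieces, groups)
  # unsplit() allocates the result from the pieces, so x's type is not imposed.
  unsplit(pieces, f)
}

res <- ave_typed(x, g, length)
typeof(res)
```

   The sketch leans on unsplit(), which creates its result vector from the
first piece rather than from x, so the type of x never enters the picture.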


