[R] Bug in lowess

Fri Oct 13 14:02:42 CEST 2006

Frank Harrell wrote:

[...]

> Thank you Brian.  It seems that no matter what is the right answer, the
> answer currently returned on my system is clearly wrong.  lowess()$y
> should be constrained to be within range(y).

Really?  That assertion is offered without proof and I am pretty sure is 
incorrect.  Consider

> x <- c(1:10, 20)
> y <- c(1:10, 5) + 0.01*rnorm(11)
> lowess(x,y)
$x
  [1]  1  2  3  4  5  6  7  8  9 10 20

$y
  [1]  0.9983192  1.9969599  2.9960805  3.9948224  4.9944158  5.9959855
  [7]  6.9964400  7.9981434  8.9990607 10.0002567 19.9946117

Remember that lowess is a local *linear* fitting method, and may give zero 
weight to some data points, so (as here) it can extrapolate.

After reading what src/appl/lowess.doc says should happen with zero 
weights, I think the answer given on Frank's system probably is the 
correct one.  Rounding error is determining which of the last two points 
is given zero robustness weight: on a i686 system both of the last two 
are, and on mine only the last is. As far as I can tell in 
infinite-precision arithmetic both would be zero, and hence the value at 
x=120 would be found by extrapolation from those (far) to the left of it.

I am inclined to think that the best course of action is to quit with a 
warning when the MAD of the residuals is effectively zero.  However, we 
need to be careful not to call things 'bugs' that we do not understand 
well enough.  This might be a design error in lowess, but it is not AFAICS 
a bug in the implementation.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595