[R] Difference between gam() and loess().

Sat Mar 21 01:49:22 CET 2009

Kevin E. Thorpe wrote:
> Ravi Varadhan wrote:
>> Good try, Kevin.  But that doesn't seem to do it.
>> library(gam)  # provides gam() and lo()
>>
>> set.seed(123)
>>
>> x <- sort(runif(100))
>>
>> y <- sin(4*pi*x) + rnorm(100, sd=0.2)
>>
>> ans.lo2 <- loess(y ~ x, degree=2, span=0.75)
>>
>> ans.gam2 <- gam(y ~ lo(x, degree=2, span=0.75))
>>
>> summary(ans.lo2$fitted - ans.gam2$fitted) # larger differences, about 10%
>>
>> ans.lo1 <- loess(y ~ x, degree=1, span=0.75)
>>
>> ans.gam1 <- gam(y ~ lo(x, degree=1, span=0.75))
>>
>> summary(ans.lo1$fitted - ans.gam1$fitted) # smaller differences, about 2-5 percent
>>
>> I also tried a number of other things including changing the "family", 
>> and parameters in "loess.control", but to no avail.  I looked at the 
>> Fortran codes from both loess and gam.  They are daunting, to say the 
>> least. They are dense, and there are absolutely no comments 
>> whatsoever.  But one thing is clear - they are using different Fortran 
>> codes.
>>
>> So, the best bet might be to get Trevor Hastie or Bill Cleveland to 
>> help you out. 
>> But, before that:  why is this an issue, Rolf?  Is it important that 
>> these two results be identical?
>>
>> Best,
>> Ravi.
>>
> 
> There was one other thing I found that I shared with Rolf off-list.
> In loess.control() there is an iterations argument which is related
> to the robustness of the estimates.  I would think that could also
> account for tail departures especially.
> 
> I don't have the gam package installed, so can't test these myself
> at the moment.

Somehow, when I read the above, Ravi, I missed that you had fiddled with 
loess.control() AND looked at the Fortran.

I guess one simple parameter change may not quite do it. :-)
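
For anyone who wants to check the loess.control() idea directly, here is a
minimal sketch using Ravi's simulated data. The object names (fit.default,
fit.iter1, fit.robust, d.iter, d.robust) are made up for illustration. Per
?loess.control, iterations is only used in robust fitting, i.e. when
family = "symmetric", so with the default family = "gaussian" it should not
be expected to change the fit. The gam() comparison at the end is optional
and assumes the gam package is installed.

```r
# Sketch: does loess.control(iterations = ) matter for a default loess() fit?
set.seed(123)
x <- sort(runif(100))
y <- sin(4 * pi * x) + rnorm(100, sd = 0.2)

# Default family = "gaussian": ordinary (non-robust) local fitting.
fit.default <- loess(y ~ x, degree = 2, span = 0.75)
fit.iter1   <- loess(y ~ x, degree = 2, span = 0.75,
                     control = loess.control(iterations = 1))

# Robust fitting: family = "symmetric", where iterations is actually used.
fit.robust  <- loess(y ~ x, degree = 2, span = 0.75, family = "symmetric")

# Differences in fitted values relative to the default fit.
d.iter   <- fitted(fit.default) - fitted(fit.iter1)
d.robust <- fitted(fit.default) - fitted(fit.robust)
print(summary(d.iter))
print(summary(d.robust))

# Optional comparison against gam(); skipped if the gam package is absent.
if (requireNamespace("gam", quietly = TRUE)) {
  library(gam)
  fit.gam <- gam(y ~ lo(x, degree = 2, span = 0.75))
  print(summary(fitted(fit.default) - fitted(fit.gam)))
}
```

If d.iter comes out as all zeros while the gam() comparison does not, that
would suggest the iterations setting is not what separates loess() from
gam(y ~ lo(x)) here.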

Kevin

> 
>>
>>
>> ----- Original Message -----
>> From: "Kevin E. Thorpe" <kevin.thorpe at utoronto.ca>
>> Date: Thursday, March 19, 2009 8:23 pm
>> Subject: Re: [R] Difference between gam() and loess().
>> To: Rolf Turner <r.turner at auckland.ac.nz>
>> Cc: R-help Forum <r-help at r-project.org>
>>
>>
>>> Rolf Turner wrote:
>>>  > It seems that in general
>>>  >     gam(y~lo(x)) # gam() from the gam package.
>>>  > and
>>>  >     loess(y~x)
>>>  > give slightly different results (in respect of the predicted/fitted
>>>  > values).
>>>  > Most noticeable at the endpoints of the range of x.
>>>  >
>>>  > Can anyone enlighten me about the reason for this difference?
>>>  >
>>>  > Is it possible to twiddle the control parameters, for either or both
>>>  > functions, so as to obtain identical results?
>>>  
>>>  There are two obvious differences in the defaults.  In lo() from the
>>>  gam package, span=0.5 and degree=1 while for loess(), span=0.75 and
>>>  degree=2.
>>>  
>>>  Try gam(y~lo(x,span=0.75,degree=2)) and see if that helps.
>>>  
>>>  Kevin
> 
> 

-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.6057