[R] need help with smooth.spline

Liaw, Andy andy_liaw at merck.com
Fri Mar 5 19:47:35 CET 2004


I looked at rqss() in nprq, as Prof. Koenker suggested, but that doesn't
have a predict() method, so I don't know how you'd get the smooth at values
other than the observed...
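
For comparison, this is roughly what I mean by getting the smooth at values other than the observed ones, using smooth.spline() and its predict() method. It is only a sketch: the data are simulated (they are not the profiles from this thread), and the names p, temp and new.p are made up for illustration.

```r
## Simulated profile (an assumption, for illustration only)
set.seed(2)
p    <- sort(runif(50, 0, 1000))
temp <- 25 - 0.02 * p + rnorm(50, sd = 0.5)

## Fit the smoothing spline; lambda is chosen by GCV by default
fit <- smooth.spline(p, temp)

## Evaluate the fitted smooth on a regular grid of new p values
new.p <- seq(25, 1000, 25)
pred  <- predict(fit, x = new.p)   # list with components x and y
str(pred)
```

predict() returns a list with components x and y, so lines(pred) adds the curve to an existing plot.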

The criteria (CV, GCV, etc.) could have multiple local minima for some data,
as Prof. Ripley and Prof. Koenker pointed out, so relying on those
`automatic' selection procedures may not be the best thing to do.
Theoretically, as lambda goes to 0 (i.e. as spar decreases), smooth.spline
should interpolate the data with a natural cubic spline; the linear fit is
the other limit, as lambda gets large.  I guess the routine could run into
numerical problems before it gets to interpolation.
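
One way to see whether the criterion has more than one local minimum is to evaluate it on a grid of spar values instead of trusting the optimizer. The following is only a sketch on simulated data (an assumption, not the data from this thread); the grid limits and the names spars, gcv and local.min are arbitrary choices.

```r
## Simulated data (an assumption, for illustration only)
set.seed(1)
x <- seq(0, 1, length.out = 100)
y <- sin(2 * pi * x) + rnorm(100, sd = 0.3)

## GCV criterion for each fixed spar; with the default cv = FALSE,
## the cv.crit component of the fit is the GCV score
spars <- seq(0.1, 1.5, by = 0.05)
gcv   <- sapply(spars, function(s) smooth.spline(x, y, spar = s)$cv.crit)

plot(spars, gcv, type = "b", xlab = "spar", ylab = "GCV")

## Grid points lower than both neighbours are candidate local minima
is.min    <- diff(sign(diff(gcv))) > 0
local.min <- spars[which(is.min) + 1]
local.min
```

If local.min has more than one element, it is worth refitting at each candidate spar and looking at the curves before settling on one.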

Here's yet another thing to try (thanks to Martin for the `lokern' package):

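library(lokern)   # the package is `lokern'; lokerns() is the function in it
par(mfrow=c(2,4))
x.out <- seq(25, 1000, 25)   # grid on which to evaluate the smooth
## dat is the list of profiles (components p, t, s) used earlier in this thread
for (i in 1:4) {
  plot(dat[[i]]$p, dat[[i]]$t)
  fit.t <- lokerns(dat[[i]]$p, dat[[i]]$t, x.out=x.out)
  lines(fit.t$x.out, fit.t$est)   # estimated curve at x.out
  plot(dat[[i]]$p, dat[[i]]$s)
  fit.s <- lokerns(dat[[i]]$p, dat[[i]]$s, x.out=x.out)
  lines(fit.s$x.out, fit.s$est)
}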

Best,
Andy

> From: W. C. Thacker
> 
> roger koenker wrote:
> > 
> > If one repeats the experiments in Craven and Wahba, the paper that
> > "invented" GCV you find, or at least I found, when I tried to do this
> > some years ago, that GCV in about 10% of cases fails rather
> > catastrophically, and this is a fairly innocuous setting. So one way
> > to interpret Brian's comment would be that maybe it is GCV that is
> > failing, and another choice of lambda might do better.
> 
> Making spar large enough avoids the outrageous values but gives a poor
> approximation to much of the data.
> 
> The problems seem to occur in a region where the data indicate the
> ocean should be well-mixed, i.e. the curve should be constant.  What's
> more, the data look as though they have been edited to remove the
> boring repeated values, keeping only the first and last values within
> the mixed layer.
> 
> When constant t and s values are inserted into the data at integer
> values for p within the mixed layer (interpolating by hand),
> smooth.spline() with GCV gives a very different result: much smoother
> and less faithful to the observations.  Much more reasonable.  It
> seems that the problem might be expected when the density of points
> changes abruptly at pretty much the same place where the gradient is
> changing.  Maybe these cases can be captured and treated separately.
> 
> Maybe the total variation penalty methods or some of Andy's
> suggestions (polymars(), mars(), locfit(), denoising with wavelets)
> will work better.  I'll have to explore some packages.
> 
> Thanks to everybody for the gracious help.
> 
> Carlisle
> -- 
> 
> William Carlisle Thacker                            
>                                                     
> Atlantic Oceanographic and Meteorological Laboratory
> 4301 Rickenbacker Causeway, Miami, Florida 33149 USA
> Office: (305) 361-4323           Fax: (305) 361-4392
> 
> "Too many have dispensed with generosity 
>      in order to practice charity."     Albert Camus
> 
> 

