[R] need help with smooth.spline

W. C. Thacker Carlisle.Thacker at noaa.gov
Thu Mar 4 16:23:30 CET 2004


"Liaw, Andy" wrote:
Andy,

Is it known that smooth.spline() has a problem handling sharp jumps? 
That is part of the question. It seems to work fine for some sharp
jumps, but I have not yet been able to determine for which cases it
should work well and for which it should fail. 

Maybe at least part of the problem has to do with end-point behavior. 
Or with sampling intervals.
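
To make the comparison concrete, here is a minimal sketch on a made-up profile (the numbers are invented for illustration and are not from the archive): an isothermal layer, a thermocline defined by only a few observations, and a slowly varying deep section. It puts linear interpolation next to smooth.spline() with defaults and with all.knots = TRUE at the target pressures. Whether it reproduces the bad behaviour will depend on the sampling, so treat it as a template for experimenting rather than a demonstration.

```r
# Synthetic profile (hypothetical values, not the archived data):
# isothermal layer to 100 dbar, a thermocline sampled at only two
# pressures, then a slow decay sampled every 25 dbar.
set.seed(1)
pressure <- c(seq(5, 100, by = 5), 110, 130, seq(200, 1600, by = 25))
deep <- pressure[pressure >= 200]
temperature <- c(rep(28, 20), 24, 18, 3 + 11 * exp(-(deep - 200) / 400))
temperature <- temperature + rnorm(length(temperature), sd = 0.05)

# Target pressures mentioned in the original question.
target <- seq(25, 1600, 25)

fit_default  <- smooth.spline(pressure, temperature)
fit_allknots <- smooth.spline(pressure, temperature, all.knots = TRUE)

# Side-by-side values at the target pressures.
comparison <- data.frame(
  pressure  = target,
  linear    = approx(pressure, temperature, xout = target)$y,
  default   = predict(fit_default, x = target)$y,
  all_knots = predict(fit_allknots, x = target)$y
)

# The first rows cover the isothermal layer and the thermocline.
head(comparison, 8)
```

Changing the number of points in the thermocline, or the spacing below it, is a quick way to probe whether the sampling intervals matter.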

The data are from an archive, so their quality and their sampling
characteristics are far from uniform.  Moreover, the true variability
is non-normal, so recognizing bad data is difficult.  As the objective
is to identify models for estimating salinity from temperature and
pressure, what is important is to avoid outliers, hoping that the
contamination from bad data with believable values is small.

I'll try to get the packages installed and take a look at the function
you mentioned.

In the meantime, do you know of some criteria for recognizing when
smooth.spline might fail?  It seems to work quite well for the bulk of
the data.
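
Lacking a known criterion, one stopgap would be to screen the fits after the fact. The sketch below is only an ad hoc heuristic, not an established diagnostic: the helper flag_suspect_fit() and its tolerance are hypothetical. It flags a profile when the smooth.spline() values at the target pressures stray far from linear interpolation or fall outside the observed range of the data.

```r
# Hypothetical helper: compares smoothing-spline values at the target
# pressures with linear interpolation and with the observed data range.
# The tolerance `tol` is an arbitrary choice, in the units of `value`.
flag_suspect_fit <- function(pressure, value, target, tol = 1) {
  # Only judge targets inside the observed pressure range.
  target <- target[target >= min(pressure) & target <= max(pressure)]
  fit      <- smooth.spline(pressure, value)
  smoothed <- predict(fit, x = target)$y
  linear   <- approx(pressure, value, xout = target)$y
  # Amount by which the fit leaves the observed range (0 if it does not).
  overshoot <- pmax(smoothed - max(value), min(value) - smoothed, 0)
  departure <- max(abs(smoothed - linear))
  list(
    max_departure = departure,
    max_overshoot = max(overshoot),
    suspect       = departure > tol || max(overshoot) > tol
  )
}

# Usage on a made-up, well-sampled profile (invented values).
set.seed(2)
p <- seq(10, 1600, by = 10)
temp <- 5 + 20 * exp(-p / 300) + rnorm(length(p), sd = 0.05)
res <- flag_suspect_fit(p, temp, target = seq(25, 1600, 25))
str(res)
```

Profiles flagged this way could be set aside and handled by linear interpolation or by one of the adaptive methods suggested below.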

Thanks,

Carlisle



> Hi Carlisle,
> 
> If I understand you correctly, the problem is smooth.spline() not handling
> sharp jump(s), right?  If so, it's probably easier to try something that can
> handle such features.  Wavelet `denoising' (as opposed to `smoothing', and
> available in the wavethresh package) is well known for being able to handle
> abrupt changes (very `spatially adaptive').  Other things you might consider
> are mars() in the `mda' package (which fits splines in an adaptive fashion)
> and locfit() in the `locfit' package.  For locfit, you will want to specify
> local smoothing parameter selection, via a call like
> 
>   locfit(..., alpha=c(0, 0, 2), acri="cp")
> 
> You might need to play with the `2' a bit to get the right amount of
> smoothing.  The details are in Loader's book `Local regression and
> Likelihood'.
> 
> HTH,
> Andy
> 
> > From: W. C. Thacker
> >
> > Andy,
> >
> > As the data are often noisy, smoothing splines should be appropriate.
> >
> > The first example profile shows an isothermal (constant temperature)
> > layer in the upper ocean followed by a sharp thermocline (large
> > temperature gradient), but there are relatively few observations
> > defining this sharp transition.  In this case simple linear
> > interpolation works fine, but smooth.spline() with all defaults gives
> > an absolutely absurd value in the isothermal layer.  With all.knots =
> > TRUE, the values in the isothermal layer are much better but still
> > peculiar.
> >
> > Given the sampling and the data, is it possible to get smooth.spline()
> > to do better?  If so, would that adversely impact its performance for
> > other cases?  (There are thousands of profiles.)  If not, is there a
> > simple way to select cases that smooth.spline() should not be
> > expected to handle, so they can be treated separately?
> >
> > Thanks,
> >
> > Carlisle
> >
> > "Liaw, Andy" wrote:
> > >
> > > If you really want interpolation, should you be using spline() rather
> > > than smooth.spline()?  The latter is for smoothing data observed with
> > > noise, not for interpolation.
> > >
> > > Andy
> > >
> > > > From: W. C. Thacker
> > > >
> > > > Dear R listers,
> > > >
> > > > When using smooth.spline to interpolate data, results are generally
> > > > good.  However, some cases produce totally unreasonable results.
> > > >
> > > > The data are values of pressure, temperature, and salinity from a
> > > > probe that is lowered into the ocean, and the objective is to
> > > > interpolate temperature and salinity to specified pressures.  While
> > > > smooth.spline provides excellent values at the observed pressures,
> > > > there are cases when the values at the desired pressures are
> > > > unusable.  A dataframe with four such profiles, indicated by values
> > > > of id, is attached.  My target values for pressure are
> > > > seq(25,1600,25), but 1:500 is also interesting.
> > > >
> > > > Setting all.knots = TRUE helps, but it would be nice to be able to
> > > > do better.
> > > >
> > > > Any suggestions?
> > > >
> > > > Thanks,
> > > >
> > > > Carlisle
> > > >
> > > > > version
> > > >          _
> > > > platform sparc-sun-solaris2.9
> > > > arch     sparc
> > > > os       solaris2.9
> > > > system   sparc, solaris2.9
> > > > status
> > > > major    1
> > > > minor    8.0
> > > > year     2003
> > > > month    10
> > > > day      08
> > > > language R



