[R] questions regarding spline functions

Dylan Beaudette dylan.beaudette at gmail.com
Tue Aug 1 20:06:16 CEST 2006


Posting reply to the list as well:

On Tuesday 01 August 2006 07:02, Myers, Brent wrote:
> Dylan,
>
> We are working on the same problem. I'm shocked that I hadn't seen the
> paper you posted and I am most grateful for that. I try to pay attention
> to McBratney's work too!
>

Hi Brent, glad to hear from a fellow soil scientist.
I originally found the McBratney paper in the book: Upscaling and Downscaling 
Methods for Environmental Research.

> The originator of the area-preserving spline is DeBoor, in A Practical
> Guide to Splines, Chapter 8. The DeBoor algorithm is available in the
> Matlab Spline Toolkit, which he wrote. Too pricey for me however. I have
> been unsuccessful implementing it thus far in S-Plus, the reference is
> not very clear in the construction of the coefficient matrix. I can send
> you my notes on that. I've paused on that for now, more below. Notice
> that these are quadratic splines (due to the number of constraints) and
> that they are not so 'flexible' in representing soils data.

Interesting. I will lookup the original author, and see if I can make any 
sense of it. This is not critical, merely a bit of initial exploration on 
horizon depth function characterizations for large data sets. 

> There are several other functions available from a CERN library
> available also in their Physics Analyst Workstation software (GPL). I
> haven't tried them. Realize also that some of the various density
> estimation algorithms in S/R are spline fits of this nature.

I will have to read up on those...

> I also considered the possibility of boundary data to define slope at
> some of the knots, but only about 1/3 of my data have good horizon
> descriptions. Linear stairstep with sloped lines across boundaries are
> trivial, but are a marked improvement over mid-horizon data.

I too have seen similar results. However, I cannot quite get the bs() function 
to use the "knots" that I have in mind: note that this may be an operator 
error... Does it make sense to use the horizon boundaries as knots?

> I am sure you are thinking about this...one big problem with area
> preserving splines is that they will not represent the true distribution
> of the soil property within the horizon if there are peaks. You
> undoubtedly realize that since the vector of pedogenesis is anisotropic,
> and that splines cannot represent this anisotropy. This is a real
> problem for me in that it is critical to identify the location of minima
> and maxima within the soil profile, more important than the property
> value in my case. Their location is grossly distorted by a spline fit to
> a mid-horizon data point. The mid-point data is better. I am taking a
> separate approach to solve that problem, and then fit the area
> preserving spline with knots defined at these locations. We need a
> general algorithm to handle this data!!!! It would be very useful for
> the soil science community!

Indeed. If you wouldn't mind, I would be interested in hearing how this 
progresses. While we are not in a position to develop such a technique, 
outside of something rather simple, we sure would be able to put it to use.

A quick summary in code and with an attached image of the various methods I 
have tried thus far, with one simple, and one complicated clay depth profile:

require(splines)
#example of two pedons on one plot:
z.1 <- c(0,2,18,24,68,160,170,192,200)
z.2 <- c(0,3,14,26,70,108,145,170,226,240)

x.1 <- mid(z.1)
x.2 <- mid(z.2)

#clay pct
y.1 <- c(0,1,2,2,4,7,6,1)
y.2 <- c(0,5,3,3,3,2,27,3,5)

fm.1 <- lm(y.1 ~ bs(x.1, df=5) )
fm.2 <- lm(y.2 ~ bs(x.2, df=5) )

isfm.1 <- interpSpline(y.1 ~ x.1)
isfm.2 <- interpSpline(y.2 ~ x.2)

new_x.1 <-  seq(range(x.1)[1], range(x.1)[2], len = 200)
new_x.2 <-  seq(range(x.2)[1], range(x.2)[2], len = 200)

new_y.1 <- predict(fm.1, data.frame(x.1=new_x.1) )
new_y.2 <- predict(fm.2, data.frame(x.2=new_x.2) )

#interpSpline() method
#note different predict() syntax
ispline_y.1 <- predict(isfm.1, new_x.1)
ispline_y.2 <- predict(isfm.2, new_x.2)

par(mfrow=c(2,1))

#simple pedon example
plot(y.1 ~ x.1, xlab="Depth", ylab="Percent Clay", type="b", pch=16, lwd=2, 
main="Simple Pedon")
#spline examples
lines(spline(y.1 ~ x.1), col="blue", lty=2)
lines(new_x.1, new_y.1, col='red', lty=2)
lines(ispline_y.1, col='green', lty=2)

legend(8,6, legend=c('soil data', 'splines()', 'interpSpline()', 'bs()'), 
col=c('black', 'blue', 'red', 'green'), lty=c(1,1,1,1), lwd=c(2,1,1,1), 
cex=0.7)

#complex pedon:
plot(y.2 ~ x.2, xlab="Depth", ylab="Percent Clay", type="b", pch=16, lwd=2, 
main="Complex Pedon")
#spline examples
lines(spline(y.2 ~ x.2), col="blue", lty=2)
lines(new_x.2, new_y.2, col='red', lty=2)
lines(ispline_y.2, col='green', lty=2)

legend(8.4,24, legend=c('soil data', 'splines()', 'interpSpline()', 'bs()'), 
col=c('black', 'blue', 'red', 'green'), lty=c(1,1,1,1), lwd=c(2,1,1,1), 
cex=0.7)


Cheers,

Dylan


> Brent
>
> D. Brenton Myers
> Graduate Fellow
> University of Missouri
> Soil Environmental and Atmospheric Sciences
> 269 Agriculture Engineering Building
> Columbia, MO 65211
> Work: (573) 882-1146
> Home: (573) 446-6856
> Cell: (573) 881-6855
> email: myersdb at missouri.edu
>
>
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Dylan Beaudette
> Sent: Monday, July 31, 2006 6:43 PM
> To: RHELP
> Subject: [R] questions regarding spline functions
>
> Greetings,
>
> A couple general questions regarding the use of splines to interpolate
> depth
> profile data.
>
> Here is an example of a set of depths, with associated attributes for a
> given
> soil profile, along  with a function for calculating midpoints from a
> set of
> soil horizon boundaries:
>
> #calculate midpoints:
> mid <- function(x) {
> for( i in 1:length(x)) {
>  if( i > 1) {
>    a[i] = (x[i] - x[i-1]) / 2 + x[i-1]
>   }
>  }
> #reurn the results
> a[which(!is.na(a))]
> }
>
> #horizon depth bounds
> z <- c(0,2,18,24,68,160,170,192,200)
>
> #horizon midpoints, associated with horizon attribute
> x <- mid(z)
>
> #clay pct
> y <- c(0,1,2,2,4,7,6,1)
>
> #plot them
> plot(y ~ x, xlab="Depth", ylab="Percent Clay", type="s")
> points(y ~ x, cex=0.5, pch=16)
>
> These point pairs usually represent a trend with depth, which I would
> like to
> model with splines - or some similar approach, as they have been found
> to
> work better than other methods such as a fitted polynomial.
>
> Using the B Spline function from the 'splines' package, it is possible
> to fit
> a model of some property with depth based on the bs() function:
>
> #natual, B-Splines
> library(splines)
>
> #fit a b-spline model:
> fm <- lm(y ~ bs(x, df=5) )
>
> I am able to predict a soil property with depth, at unsampled locations
> with
> this model with:
>
> new_x <-  seq(range(x)[1], range(x)[2], len = 200)
>
> #predict attribute at unsampled depths:
> new_y <- predict(fm, data.frame(x=new_x) )
>
> #plot the predicted attribute at the unsampled depths
> lines(new_x, new_y, col='red')
>
> This tends to work fairly well (see attached), but I am wondering if I
> can use
> the 'knots' parameter in the bs() function for incorporation of horizon
> boundary information into the creation of the spline. Moreover, it would
> be
> nice if the spline-based model 'fm' would predict a set of values with
> similar mean and range as the original data points: i.e
>
> #summary of clay values from original data:
> summary(y)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   0.000   1.000   2.000   2.875   4.500   7.00
>
> #see above
> summary(new_y)
>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
> -0.05786  2.09500  3.13200  3.62800  5.17100  7.08700
>
>
> This is based on an article I read :
> http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V67-3WWRDYY-3
> &_user=4421&_handle=V-WA-A-W-AU-MsSAYWW-UUA-U-AACZEDZWBC-AACVCCDUBC-BZAY
> UEWB-AU-U&_fmt=summary&_coverDate=08%2F31%2F1999&_rdoc=3&_orig=browse&_s
> rch=%23toc%235807%231999%23999089998%23108393!&_cdi=5807&view=c&_acct=C0
> 00059598&_version=1&_urlVersion=0&_userid=4421&md5=488f1e114d8d64265ff65
> 506e9587e71
>
> where the author talks about a so-called 'equal-area quadratic smoothing
>
> spline' approach to describing a soil property depth function.
> Unfortunately
> the author did not provide sample code....
>
> Any thoughts / input would be greatly appreciated!
>
> Cheers,

-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341
-------------- next part --------------
A non-text attachment was scrubbed...
Name: spline2.png
Type: image/png
Size: 33301 bytes
Desc: not available
Url : https://stat.ethz.ch/pipermail/r-help/attachments/20060801/4f119153/attachment.png 


More information about the R-help mailing list