[R] Great difference for piecewise linear function between R and SAS

Mon Mar 24 10:48:13 CET 2008

On 24/03/2008 5:23 AM, zhijie zhang wrote:
> Dear Rusers,
>   I am now using  R and SAS to fit the piecewise linear functions, and what
> surprised me is that they have a great differrent result. See below.

You're using different bases.  Compare the predictions, not the 
coefficients.

To see the bases that R uses, do this:

matplot(distance, bs(distance,degree=1,knots=c(16.13,24)))
matplot(y, bs(y,degree=1,knots=c(-0.4357,-0.3202)))

Duncan Murdoch

> #R code--Knots for distance are 16.13 and 24, respectively, and Knots for y
> are -0.4357 and -0.3202
> m.glm<-glm(mark~x+poly(elevation,2)+bs(distance,degree=1,knots=c(16.13,24))
>                 +bs(y,degree=1,knots=c(-0.4357,-0.3202
> )),family=binomial(logit),data=point)
> summary(m.glm)
> 
> Coefficients:
>                                                       Estimate Std. Error z
> value Pr(>|z|)
> (Intercept)                                             12.104      3.138
> 3.857 0.000115 ***
> x                                                       5.815      1.987
> 2.926 0.003433 **
> poly(elevation, 2)1                                     6.654      4.457
> 1.493 0.135444
> poly(elevation, 2)2                                     -6.755      3.441  -
> 1.963 0.049645 *
> bs(distance, degree = 1, knots = c(16.13, 24))1         -1.291      1.139  -
> 1.133 0.257220
> bs(distance, degree = 1, knots = c(16.13, 24))2        -10.348      2.025  -
> 5.110 3.22e-07 ***
> bs(distance, degree = 1, knots = c(16.13, 24))3        -3.530      3.752  -
> 0.941 0.346850
> bs(y, degree = 1, knots = c(-0.4357, -0.3202))1        -6.828      1.836  -
> 3.719 0.000200 ***
> bs(y, degree = 1, knots = c(-0.4357, -0.3202))2        -4.426      1.614  -
> 2.742 0.006105 **
> bs(y, degree = 1, knots = c(-0.4357, -0.3202))3        -11.216      2.861  -
> 3.920 8.86e-05 ***
> 
> #SAS codes
> data b;
>  set a;
>  if distance > 16.13 then d1=1; else d1=0;
>  distance2=d1*(distance - 16.13);
>  if distance > 24 then d2=1; else d2=0;
>  distance3=d2*(distance - 24);
>  if y>-0.4357 then dd1=1; else dd1=0;
>  y2=dd1*(y+0.4357);
>  if y>-0.3202 then dd2=1; else dd2=0;
>  y3=dd2*(y+0.3202);
> run;
> 
> proc logistic descending data=b;
>  model mark =x elevation elevation*elevation distance distance2 distance3 y
> y2 y3;
> run;
> 
> 
>               The LOGISTIC Procedure  Analysis of Maximum Likelihood
> Estimates
> 
>                                                      Standard          Wald
>            Parameter               DF    Estimate       Error
> Chi-Square    Pr > ChiSq
> 
>            Intercept                1     -2.6148      2.1445
> 1.4867
> 0.2227
>            x                        1      5.8146      1.9872
> 8.5615
> 0.0034
>            elevation                1      0.4457      0.1506
> 8.7545
> 0.0031
>            elevation*elevation      1     -0.0279      0.0142
> 3.8533
> 0.0496
>            distance                 1     -0.1091      0.0963
> 1.2836
> 0.2572
>            distance2                1     -1.0418      0.2668
> 15.2424
> <.0001
>            distance3                1      2.8633      0.7555
> 14.3625
> 0.0002
>            y                        1    -16.2032      4.3568
> 13.8314
> 0.0002
>            y2                       1     36.9974     10.3219
> 12.8476
> 0.0003
>            y3                       1    -58.4938     14.0279
> 17.3875
> <.0001
> Q: What is the problem? which one is correct for the piecewise linear
> function?
>   Thanks very much.
> 
>