[R] help interpreting output?

Tue Jan 7 16:44:03 CET 2003

Mike  -

I observe that you have dropped by far the most significant single
predictor, s(GED), in going from the first to the second model.  If I had
to guess, I would guess that the remaining predictor variables are
either binary indicator variables or else have only a handful of
distinct values.  Can't diagnose much more than that in the absence
of the actual data.
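
One quick way to test that guess is to count the distinct values in
each predictor.  What follows is only a minimal sketch: the data frame
`dat` and its contents are simulated stand-ins (the real data were not
posted), and only the column names are borrowed from the quoted formulas.

```r
# Hypothetical stand-in for the real data, which were not posted.
set.seed(1)
dat <- data.frame(
  INT = rnorm(60, mean = 300, sd = 100),   # response
  IGS = sample(0:1, 60, replace = TRUE),   # simulated binary indicator
  L2E = sample(1:4, 60, replace = TRUE),   # simulated handful of levels
  GED = runif(60)                          # simulated continuous predictor
)

# Count the distinct values taken by each predictor.
predictors <- setdiff(names(dat), "INT")
n_distinct <- sapply(dat[predictors], function(x) length(unique(x)))
print(n_distinct)
```

A predictor with only two or three distinct values leaves very little
for a spline to work with, so it may be better entered as a linear or
factor term than as s().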

If it were my problem, I would plot the response against each
predictor, also residuals vs. fitted values for each model, and
do some graphical data analysis to diagnose what's going on.
I encourage you to do this for yourself.
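
Roughly, the plots I have in mind could be produced as sketched here.
This is an illustration under stated assumptions, not a reconstruction
of your analysis: the data are simulated, the object names `full` and
`reduced` are hypothetical, and only two of the six predictors are
included to keep it short.

```r
library(mgcv)  # recommended package, included with standard R installs

# Hypothetical stand-in data; column names borrowed from the quoted formulas.
set.seed(1)
n <- 60
dat <- data.frame(IGS = runif(n), GED = runif(n))
dat$INT <- 300 + 80 * sin(2 * pi * dat$IGS) + 200 * dat$GED^2 +
  rnorm(n, sd = 40)

# Fit a model with both smooths, and one with the GED smooth dropped.
full    <- gam(INT ~ s(IGS) + s(GED), data = dat)
reduced <- gam(INT ~ s(IGS), data = dat)

op <- par(mfrow = c(2, 2))

# Response against each predictor.
plot(dat$IGS, dat$INT, xlab = "IGS", ylab = "INT", main = "INT vs. IGS")
plot(dat$GED, dat$INT, xlab = "GED", ylab = "INT", main = "INT vs. GED")

# Residuals against fitted values for each model.
plot(fitted(full), residuals(full), xlab = "Fitted", ylab = "Residuals",
     main = "Full model")
abline(h = 0, lty = 2)
plot(fitted(reduced), residuals(reduced), xlab = "Fitted",
     ylab = "Residuals", main = "Reduced model")
abline(h = 0, lty = 2)

par(op)
```

Structure left in the residuals of the reduced model, but absent from
those of the full model, would point to the dropped predictor.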

						-  tom blackwell  -

On Mon, 6 Jan 2003, Michael F. Palopoli wrote:

> Dear R experts,
>
> I'm hoping someone can help me to interpret the results of building
> gam's with mgcv in R.
>
> Below are summaries of two gam's based on the same dataset.  The first
> gam (named "mod.gam") has six predictor variables.  The second gam
> (named "mod.gam2") is exactly the same except it is missing one of the
> predictor variables.  What is confusing me is the estimated degrees of
> freedom for each of the splines in the second model....
>
> ________________
>
>  > summary.gam(mod.gam)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> INT ~ s(IGS) + s(L2E) + s(TED) + s(PSD) + s(OPD) + s(GED)
>
> Parametric coefficients:
>            Estimate  std. err.    t ratio    Pr(>|t|)
> constant     302.32      5.192      58.23    < 2.22e-16
>
> Approximate significance of smooth terms:
>               edf       chi.sq     p-value
> s(IGS)      4.254       58.308     9.5524e-12
> s(L2E)          1       8.7673     0.0030668
> s(TED)          1       8.3915     0.0037697
> s(PSD)          1       6.0234     0.014118
> s(OPD)      2.289       12.745     0.0024349
> s(GED)      3.791       152.68     < 2.22e-16
>
> R-sq.(adj) = 0.885   Deviance explained = 91.1%
> GCV score = 2124.9   Scale est. = 1617.3    n = 60
>
> ________________
>
>  > summary.gam(mod.gam2)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> INT ~ s(IGS) + s(L2E) + s(TED) + s(PSD) + s(OPD)
>
> Parametric coefficients:
>            Estimate  std. err.    t ratio    Pr(>|t|)
> constant     302.32  4.736e-14  6.384e+15    < 2.22e-16
>
> Approximate significance of smooth terms:
>               edf       chi.sq     p-value
> s(IGS)  1.757e-05   1.3524e+09     < 2.22e-16
> s(L2E)   0.009991      0.21394     0.6437
> s(TED)  2.945e-05   1.4913e+07     < 2.22e-16
> s(PSD)  2.566e-05   6.5495e+06     < 2.22e-16
> s(OPD)  5.023e-05   3.2332e+07     < 2.22e-16
>
> R-sq.(adj) = 0.645   Deviance explained = 64.5%
> GCV score = 7489.7   Scale est. = 6069.7    n = 60
>
>
> ________________
>
>
> Any suggestions about either (1) what went wrong with the second model,
> or (2) how the heck I should interpret these results?
>
> Thanks,
>
> Mike.