[R] GAM with the negative binomial distribution: why do predictions not match the original values?

Marine Regis marine.regis at hotmail.fr
Tue Nov 22 22:29:16 CET 2016


Hello,

From capture data, I would like to assess the effect of longitudinal changes in the proportion of forest on the abundance of skunks. To test this, I built a GAM in which the dependent variable is the number of unique skunks and the independent variables are the X coordinates of the centroids of the trapping sites (called "x" in the GAM) and the proportion of forest within the trapping sites (called "prop_forest" in the GAM):

    mod <- gam(nb_unique ~ s(x,prop_forest), offset=log_trap_eff, family=nb(theta=NULL, link="log"), data=succ_capt_skunk, method = "REML", select = TRUE)
    summary(mod)

    Family: Negative Binomial(13.446)
    Link function: log

    Formula:
    nb_unique ~ s(x, prop_forest)

    Parametric coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept) -2.02095    0.03896  -51.87   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Approximate significance of smooth terms:
                       edf Ref.df Chi.sq  p-value
    s(x,prop_forest) 3.182     29  17.76 0.000102 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    R-sq.(adj) =   0.37   Deviance explained =   49%
    -REML = 268.61  Scale est. = 1         n = 58


I built the GAM with the negative binomial family. When I use the function `predict.gam`, the predictions from the GAM and the values of `nb_unique` in the original data are very different. What is the reason for these differences?

**With GAM:**

    modPred <- predict.gam(mod, se.fit=TRUE,type="response")
    summary(modPred$fit)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.1026  0.1187  0.1333  0.1338  0.1419  0.1795

**With original data:**

    summary(succ_capt_skunk$nb_unique)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      17.00   59.00   82.00   81.83  106.80  147.00
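
A possible explanation, sketched on simulated data rather than on `succ_capt_skunk`: according to the mgcv help page for `gam`, an offset passed through the `offset` argument is used for fitting but ignored when predicting, whereas an offset written as `offset(...)` in the formula is used in both. If that applies here, the predictions above would be rates per unit of trapping effort rather than counts, which would be consistent with exp(-2.02095) ≈ 0.13 being close to the predicted mean. In the sketch below, the data frame `dat`, the column `trap_eff`, and all simulated coefficients are hypothetical; only the model structure mirrors the call above.

```r
library(mgcv)  # provides gam() and the nb() family

set.seed(1)
n <- 200

# Hypothetical data standing in for succ_capt_skunk; every value is simulated
dat <- data.frame(
  x           = runif(n),
  prop_forest = runif(n),
  trap_eff    = runif(n, 300, 900)  # assumed trapping effort per site
)
dat$log_trap_eff <- log(dat$trap_eff)

# Counts proportional to effort, with negative binomial overdispersion
mu <- dat$trap_eff * exp(-2 + 0.5 * dat$prop_forest)
dat$nb_unique <- rnbinom(n, mu = mu, size = 10)

# Offset passed as an argument: used in fitting, ignored by predict.gam()
mod_arg <- gam(nb_unique ~ s(x, prop_forest), offset = log_trap_eff,
               family = nb(), data = dat, method = "REML")

# Offset written in the formula: used in fitting and in prediction
mod_form <- gam(nb_unique ~ s(x, prop_forest) + offset(log_trap_eff),
                family = nb(), data = dat, method = "REML")

pred_arg  <- as.numeric(predict(mod_arg,  type = "response"))
pred_form <- as.numeric(predict(mod_form, type = "response"))

# Predictions from mod_arg are per unit effort; multiply by the effort
# to put them back on the count scale
pred_arg_counts <- pred_arg * exp(dat$log_trap_eff)

summary(pred_arg)         # rate scale
summary(pred_form)        # count scale
summary(pred_arg_counts)  # count scale, rescaled by hand
summary(dat$nb_unique)    # observed counts
```

With the original model, the analogous check would be to compare `modPred$fit * exp(succ_capt_skunk$log_trap_eff)` with `succ_capt_skunk$nb_unique`, assuming `log_trap_eff` is a column of that data frame.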

This question has already been posted on Cross Validated (http://stats.stackexchange.com/questions/247347/gam-with-the-negative-binomial-distribution-why-do-predictions-no-match-with-or) without success.

Thanks a lot for your time.
Have a nice day
Marine

