[R] Saving fits (glm, nls) without data

Tue Sep 7 22:18:54 CEST 2010

David Winsemius <dwinsemius at comcast.net> writes:

> Just tested my theory and it seems to be holding up. Took the example
> on the predict help page, set three of the variable length components
> not needed in the predict operations to NULL and the code still runs
> fine. It does not appear that either predict.glm or predict.lm check
> to see if there are any missing components:

Going through that code, I settled on the following function, which
keeps only the components that prediction actually needs:

   ## Strip a glm object down to the components that predict() needs when
   ## it is called with newdata, so that the saved object is nice and small.
   strip.glm <- function (f) {
     f.str <- list(coefficients=f$coefficients,
                   family=f$family,
                   terms=f$terms,
                   qr=list(pivot=f$qr$pivot),  # keep only the pivot vector
                   rank=f$rank,
                   na.action=f$na.action)
     ## Stop terms from holding on to the environment the formula came from.
     attr(f.str$terms, ".Environment") <- globalenv()
     class(f.str) <- class(f)
     f.str
   }
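
A quick way to check that a stripped fit still predicts the same values
as the full one is sketched below.  The simulated data frame and the
names d, nd, fit and small are made up for illustration; strip.glm is
repeated so the snippet runs on its own.  Only a numeric predictor is
used here.  For factor predictors, predict.lm also consults the xlevels
and contrasts components, so those would probably have to be kept too;
that case is not exercised here.

```r
## Same function as above, repeated so this snippet is self-contained.
strip.glm <- function (f) {
  f.str <- list(coefficients=f$coefficients,
                family=f$family,
                terms=f$terms,
                qr=list(pivot=f$qr$pivot),
                rank=f$rank,
                na.action=f$na.action)
  attr(f.str$terms, ".Environment") <- globalenv()
  class(f.str) <- class(f)
  f.str
}

## Illustrative data only: one numeric predictor, binary response.
set.seed(1)
d <- data.frame(x = rnorm(200))
d$y <- rbinom(200, 1, plogis(0.5 + d$x))

fit   <- glm(y ~ x, family = binomial, data = d)
small <- strip.glm(fit)

## Predictions on new data should agree between the two objects.
nd <- data.frame(x = c(-1, 0, 1))
p.full  <- predict(fit,   newdata = nd, type = "response")
p.small <- predict(small, newdata = nd, type = "response")
print(all.equal(p.full, p.small))
```

Note that predict(small) without newdata has nothing to work with, since
the fitted values and linear predictors have been dropped.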

The truly enormous volume of data was being dragged in by the
environment attached to terms; setting that to the global environment
shrank the saved file from 490 MB to 32 MB, which is about the size of
the data matrix.  Stripping the qr component down to just the pivot
vector then reduced the size to 2.8 KB.
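
The environment effect can be reproduced on a small scale.  The sketch
below assumes the model is fitted inside a function whose frame also
holds a large, unrelated object (the post does not say that this is how
the original fit was made); fit.in.function, big and saved.size are
hypothetical names.  It saves the terms component before and after
resetting its environment and compares the file sizes.

```r
## Hypothetical setup: the formula is created inside a function, so its
## environment is that function's frame, which also contains 'big'.
fit.in.function <- function() {
  big <- rnorm(1e6)                    # about 8 MB, never used by the model
  d <- data.frame(x = rnorm(100))
  d$y <- rpois(100, exp(0.2 * d$x))
  glm(y ~ x, family = poisson, data = d)
}

## Size in bytes of an object once written to disk with saveRDS().
saved.size <- function(obj) {
  path <- tempfile(fileext = ".rds")
  on.exit(unlink(path))
  saveRDS(obj, path)
  file.size(path)
}

fit <- fit.in.function()
tt  <- fit$terms

before <- saved.size(tt)               # drags the whole frame along
attr(tt, ".Environment") <- globalenv()
after  <- saved.size(tt)               # the global environment is not saved

print(c(before = before, after = after))
```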

(Other components of the glm object were also bringing the environment
along with them; I've not experimented to see which were the other
offenders.  terms was the only one that I needed.)
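
One way to look for the other offenders might be to serialize each
component separately and compare sizes.  This is only a sketch under the
same assumption as before, namely a fit made inside a function that
holds a large local object; fit.in.function and big are hypothetical
names, and the sizes will differ from one model to the next.

```r
## Hypothetical setup: fit a model inside a function with a big local object.
fit.in.function <- function() {
  big <- rnorm(1e6)                    # about 8 MB, never used by the model
  d <- data.frame(x = rnorm(100))
  d$y <- rpois(100, exp(0.2 * d$x))
  glm(y ~ x, family = poisson, data = d)
}
fit <- fit.in.function()

## Serialized size in bytes of every component of the fit.  Components that
## reference the function's frame carry its contents with them.
sizes <- sapply(unclass(fit), function(x) length(serialize(x, NULL)))

## Largest components first.
print(sort(sizes, decreasing = TRUE))
```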

Thanks for your help,

Johann