[R] Predictions on training set shorter than training set

William Dunlap wdunlap at tibco.com
Thu Apr 23 22:42:01 CEST 2015


Are there missing values in your data?  If so, try adding
the argument
   na.action = na.exclude
to your original call to glm() or lm().  Like the default
na.omit, it drops rows that contain missing values before
fitting, but it also pads the predictions, residuals, etc.
with NAs at those rows, so they line up with the original data.
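
As a minimal sketch of the difference (the toy data frame below is
made up for illustration and is not Mark's data), compare the two
fits:

```r
# Hypothetical toy data: 10 rows, two of them with a missing predictor.
train <- data.frame(
  y = c(0, 1, 0, 0, 1, 1, 1, 0, 0, 0),
  x = c(0.2, 1.5, NA, 0.9, 2.1, -0.3, NA, 0.1, 1.8, -1.0)
)

# Default handling (na.omit): rows 3 and 7 are dropped and stay dropped.
fit_omit <- glm(y ~ x, data = train, family = binomial)

# na.exclude: the same rows are dropped for fitting, but padded back
# with NA in predictions and residuals.
fit_excl <- glm(y ~ x, data = train, family = binomial,
                na.action = na.exclude)

p_omit <- predict(fit_omit, type = "response")
p_excl <- predict(fit_excl, type = "response")

nrow(train)       # 10
length(p_omit)    # 8
length(p_excl)    # 10

# Positions of the padded entries match the rows with missing x.
unname(which(is.na(p_excl)))
```

With the padded version, something like train$pred <- p_excl should
work directly, since the lengths agree.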

You can also set
   options(na.action = "na.exclude")
to make it the default na.action in lm() and similar functions.
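
A short sketch of the global setting, again on made-up data; saving
the value returned by options() lets you restore the previous
setting afterwards:

```r
# Hypothetical data with two missing predictor values.
d <- data.frame(
  x = c(1, 2, NA, 4, 5, NA, 7, 8, 9, 10),
  y = c(1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 7.2, 7.9, 9.1, 9.8)
)

# options() returns the old values invisibly, so keep them.
old <- options(na.action = "na.exclude")

fit <- lm(y ~ x, data = d)   # no na.action argument needed now

length(residuals(fit))       # same length as nrow(d)
sum(is.na(fitted(fit)))      # one NA per row that was dropped

# Put the previous setting back.
options(old)
```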




Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Apr 23, 2015 at 10:23 AM, Mark Drummond <mark at markdrummond.ca>
wrote:

> Hi all,
>
> Given a simple logistic regression on a training data set using glm,
> the number of predicted values is less than the number of observations
> in the training set:
>
> > fit.train.pred <- predict(fit, type = "response")
> > nrow(train)
> [1] 62660
> > length(fit.train.pred)
> [1] 58152
> >
>
> As a relative newcomer, I've run lots of simple glm, CART etc. models
> but this is the first time I have seen this happen.
>
> Is this a common issue and is there a fix? An option to predict() perhaps?
>
> --
> Cheers, Mark
>
> Mark Drummond
> mark at markdrummond.ca
>
> When I get sad, I stop being sad and be Awesome instead. TRUE STORY.
>
