[R] warning with glm.predict, wrong number of data rows

Charles Berry ccberry at ucsd.edu
Thu May 3 18:19:00 CEST 2012


carol white <wht_crl <at> yahoo.com> writes:

> 
> Hi,
> I split a data set into two partitions (80 and 42), use the first as the
> training set in glm and the second as testing set in glm predict. But when
> I call glm.predict, I get the warning message:
> 
> Warning message:
> 'newdata' had 42 rows but variable(s) found have 80 rows 
> ---------------------

[snip]

The warning correctly diagnoses the problem.

The posting guide asks for a 'reproducible example', but you did not give us one.

There is one below. 

Note what happens when predict() tries to reconstruct the variable 'x[1:4]'
as dictated by the formula.

How many elements can 'x[1:4]' have when newdata has (say) nrowsNew rows?

Instead of indexing inside the formula, use the 'subset' argument of glm()
to select a subset of observations.
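
As a minimal sketch of the 'subset' approach: the data frame 'dat', the
index vectors 'train' and 'test', and the simulated values are all
hypothetical, chosen only to mirror the 80/42 split in the question. The
formula uses plain variable names, so predict() can look up 'x' in newdata.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Hypothetical data: 122 rows, mirroring the 80/42 split in the question
dat <- data.frame(x = rnorm(122))
dat$y <- factor(rbinom(122, 1, plogis(0.5 * dat$x)))

train <- 1:80    # rows used for fitting
test  <- 81:122  # rows held out for prediction

# Plain variable names in the formula; rows are chosen via 'subset'
fit <- glm(y ~ x, family = binomial, data = dat, subset = train)

# 'x' is found in newdata, so there is one prediction per new row
p <- predict(fit, newdata = dat[test, ], type = "response")
length(p)  # 42
```

Fitting on 'dat[train, ]' directly should behave the same way; the point is
only that the formula refers to 'x' and 'y', not to indexed expressions.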


> y <- sample(factor(1:2),5,repl=T)
> x <- 1:4
> fit <- glm( y[1:4] ~ x[1:4], family = binomial)
> fit

Call:  glm(formula = y[1:4] ~ x[1:4], family = binomial)

Coefficients:
(Intercept)       x[1:4]  
 -1.110e-16    0.000e+00  

Degrees of Freedom: 3 Total (i.e. Null);  2 Residual
Null Deviance:      5.545 
Residual Deviance: 5.545        AIC: 9.545 
> predict(fit,newdata=data.frame(x=1:2))
            1             2             3             4 
-1.110223e-16 -1.110223e-16            NA            NA 
Warning message:
'newdata' had 2 rows but variable(s) found have 4 rows 
> predict(fit,newdata=data.frame(x=1:5))
            1             2             3             4 
-1.110223e-16 -1.110223e-16 -1.110223e-16 -1.110223e-16 
Warning message:
'newdata' had 5 rows but variable(s) found have 4 rows 
>
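
The lengths in the transcript above can be reproduced by evaluating the
formula term by hand. This is only a rough illustration of what happens when
the term 'x[1:4]' is rebuilt from newdata; it uses eval() directly and is
not the actual code path inside predict().

```r
x <- 1:4
newdata <- data.frame(x = 1:2)

# Evaluate the formula term with 'x' taken from newdata
term <- eval(quote(x[1:4]), newdata)

term           # 1 2 NA NA: indexing past the end pads with NA
length(term)   # 4, fixed by the '[1:4]' in the formula
nrow(newdata)  # 2, hence the mismatch reported in the warning
```

With 'data.frame(x = 1:5)' the same term picks out only the first four
values, which is why that call also returns four predictions.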


HTH,

Chuck

[rest deleted]


