[R] problems with glm

stephenc at ics.mq.edu.au stephenc at ics.mq.edu.au
Tue Oct 2 05:34:06 CEST 2007


I am having a couple of problems that someone may be able to shed some light on.


Question 1:

I am fitting a logistic regression model, but when I do this:

glm.model = glm(as.factor(form$finished) ~ ., family=binomial,
data=form[1:150000,])

I get this:


Error in model.frame(formula, rownames, variables, varnames, extras,
extranames,  :
        variable lengths differ (found for 'barrier')


which is very strange, because when I name the predictors explicitly like this:

glm.model = glm(as.factor(form$finished) ~ form$first + form$second +
form$third + form$barrier, family=binomial, data=form[1:150000,])

It produces a model:

Call:
glm(formula = as.factor(form$finished) ~ form$first + form$second +
    form$third + form$barrier, family = binomial, data = form[1:150000,
    ])

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.0884  -0.4932  -0.3951  -0.3006   2.7135

Coefficients:
              Estimate Std. Error  z value Pr(>|z|)
(Intercept)  -2.957831   0.021446 -137.920  < 2e-16 ***
form$first    0.624463   0.078036    8.002 1.22e-15 ***
form$second   0.754057   0.080787    9.334  < 2e-16 ***
form$third    7.718261   0.078532   98.281  < 2e-16 ***
form$barrier -0.058185   0.002175  -26.751  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 144850  on 215213  degrees of freedom
Residual deviance: 133292  on 215209  degrees of freedom
AIC: 133302

Number of Fisher Scoring iterations: 5

Any idea why the first glm call doesn't work?
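
A likely explanation, sketched on simulated data since the real `form` is not available: `form$finished` is evaluated on the full data frame, while `.` expands to the columns of the `data` subset, so the lengths disagree. In the second call every term is written as `form$...`, so the `data` argument is effectively bypassed and all rows are used, which is consistent with the 215213 null degrees of freedom in the summary above. The column names mirror those in the post; the sample sizes, the coefficients, and the names `train` and `bad` are assumptions made for illustration.

```r
# Simulated stand-in for `form` (sizes and coefficients are made up)
set.seed(1)
n <- 1000
form <- data.frame(
  first   = rbinom(n, 1, 0.1),
  second  = rbinom(n, 1, 0.1),
  third   = rbinom(n, 1, 0.1),
  barrier = rnorm(n)
)
form$finished <- rbinom(
  n, 1, plogis(-1 + form$first + form$third - 0.5 * form$barrier)
)

train <- form[1:700, ]

# form$finished has 1000 values, but `.` expands to columns of the
# 700-row subset, so model.frame() refuses to combine them
bad <- tryCatch(
  glm(as.factor(form$finished) ~ ., family = binomial, data = train),
  error = function(e) e
)
inherits(bad, "error")

# Bare column names are looked up in `data`, so only the subset is used
glm.model <- glm(finished ~ ., family = binomial, data = train)
nobs(glm.model)  # 700
```

Writing the response as a bare column name keeps every variable tied to the same subset, so the `.` shorthand and the row selection both behave as intended.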

Question 2:

Now I want to predict, so I do this:

pred <- predict(glm.model,data=form[150001:200000,],type="response")

but when I try to use it I get this:

> pred <- predict(glm.model,data=form[150001:200000,],type="response")
> t = table(pred,form$finished[150001:200000])
Error in table(pred, form$finished[150001:2e+05]) :
        all arguments must have the same length

and checking the length confirms that pred is not 50000 long:

> length(pred)
[1] 215214

It doesn't look as though my selection of rows has worked at all. Does anyone
know why?
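
A probable cause, again sketched on simulated data: `predict()` for a glm takes new rows through `newdata`, not `data`. An unrecognised `data` argument is silently swallowed, so the call returns fitted values for the rows used in fitting, which would explain a length of 215214. Note that `newdata` only helps when the model was fitted with bare column names, not `form$...` terms. The sizes, coefficients, the names `train`, `test`, and `pred_wrong`, and the 0.5 cut-off are assumptions made for illustration.

```r
# Simulated stand-in for `form` (sizes and coefficients are made up)
set.seed(1)
n <- 1000
form <- data.frame(
  first   = rbinom(n, 1, 0.1),
  second  = rbinom(n, 1, 0.1),
  third   = rbinom(n, 1, 0.1),
  barrier = rnorm(n)
)
form$finished <- rbinom(
  n, 1, plogis(-1 + form$first + form$third - 0.5 * form$barrier)
)

train <- form[1:700, ]
test  <- form[701:1000, ]

glm.model <- glm(finished ~ ., family = binomial, data = train)

# `data` is not an argument of predict(); it is ignored and the
# fitted values for the 700 training rows come back instead
pred_wrong <- predict(glm.model, data = test, type = "response")
length(pred_wrong)  # 700

# `newdata` is the argument that supplies the rows to predict on
pred <- predict(glm.model, newdata = test, type = "response")
length(pred)  # 300

# Predicted probabilities are continuous, so threshold them before
# tabulating against the observed outcome (0.5 is an arbitrary choice)
table(predicted = pred > 0.5, actual = test$finished)
```

Tabulating the raw probabilities directly would give roughly one row per distinct value, so a cut-off of some kind is usually applied first.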

Stephen


