[R] Genmod in SAS vs. glm in R

Ajay Ohri ohri2007 at gmail.com
Wed Sep 10 11:52:08 CEST 2008


What's the R equivalent of PROC LOGISTIC in SAS? Is there a stepwise
method there?

How do I create scoring models in R for larger datasets (200 MB)? Is
there a way to compress and use datasets (like options compress=yes;)?

Ajay

On Wed, Sep 10, 2008 at 11:12 AM, Peter Dalgaard
<p.dalgaard at biostat.ku.dk> wrote:
> Rolf Turner wrote:
>>
>> For one thing your call to glm() is wrong --- didn't you notice the
>> warning messages about ``non-integer #successes in a binomial glm!''?
>>
>> You need to do either:
>>
>> glm(r/k ~ x, family=binomial(link='cloglog'), data=bin_data,
>> offset=log(y), weights=k)
>>
>> or:
>>
>> glm(cbind(r,k-r) ~ x, family=binomial(link='cloglog'), data=bin_data,
>> offset=log(y))
>>
>> You get the same answer with either, but this answer still does not
>> agree with your SAS results.  Perhaps you have an error in your SAS
>> syntax as well.  I wouldn't know.
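
For reference, here is a minimal sketch of the two calls above, run on the
four rows from the original post. The object names fit_prop and fit_cbind
are my own labels and do not appear elsewhere in the thread. Both forms
describe the same binomial likelihood, so the coefficients should agree.

```r
# Data from the original post: r failures out of a risk set of size k,
# exposure y (used as log offset), covariate x
bin_data <- data.frame(y = c(5, 5, 4, 4), r = c(1, 0, 0, 1),
                       k = c(3, 2, 2, 2), x = c(0.5, 0.5, 1.0, 1.0))

# Form 1: proportion response, with the risk-set size as prior weights
fit_prop <- glm(r/k ~ x, family = binomial(link = "cloglog"),
                data = bin_data, offset = log(y), weights = k)

# Form 2: two-column (events, non-events) response, no weights needed
fit_cbind <- glm(cbind(r, k - r) ~ x, family = binomial(link = "cloglog"),
                 data = bin_data, offset = log(y))

# Compare the fitted coefficients from the two forms
print(coef(fit_prop))
print(coef(fit_cbind))
```

Leaving out weights=k in the first form is what triggers the
non-integer #successes warning mentioned above.
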
>
> The data created in the data step are not those used in the analysis.
> Changing to
>
> data nelson;
> <etc>
>
> gives the same result as  R on the versions I have available:
>
> Analysis Of Parameter Estimates
>
>                               Standard     Wald 95% Confidence      Chi-
> Parameter   DF   Estimate      Error           Limits             Square   Pr > ChiSq
>
> Intercept    1    -3.5866     2.2413    -7.9795     0.8064         2.56       0.1096
> x            1     0.9544     2.8362    -4.6046     6.5133         0.11       0.7365
> Scale        0     1.0000     0.0000     1.0000     1.0000
>
> and
> Call:
> glm(formula = r/k ~ x, family = binomial(link = "cloglog"), data = bin_data,
>   weights = k, offset = log(y))
>
> Deviance Residuals:
>       1        2        3        4
>  0.5407  -0.9448  -1.0727   0.7585
>
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -3.5866     2.2413  -1.600    0.110
> x             0.9544     2.8362   0.336    0.736
>
>
>>
>>    cheers,
>>
>>        Rolf Turner
>>
>>    On 10/09/2008, at 10:37 AM, sandsky wrote:
>>
>>>
>>> Hello,
>>>
>>> I have different results from these two software packages for a simple
>>> binomial GLM problem.
>>>
>>> From Genmod in SAS: LogLikelihood=-4.75, coeff(intercept)=-3.59,
>>> coeff(x)=0.95
>>>
>>> From glm in R: LogLikelihood=-0.94, coeff(intercept)=-3.99,
>>> coeff(x)=1.36
>>>
>>> Can anyone tell me what I did wrong?
>>>
>>> Here are the code and results,
>>>
>>> 1) SAS Genmod:
>>>
>>> % r: # of failure
>>> % k: size of a risk set
>>>
>>> data bin_data;
>>> input r k y x;
>>> os=log(y);
>>> cards;
>>> 1    3    5    0.5
>>> 0    2    5    0.5
>>> 0    2    4    1.0
>>> 1    2    4    1.0
>>> ;
>>> proc genmod data=nelson;
>>>    model r/k = x /     dist = binomial     link =cloglog   offset = os ;
>>>
>>>     <Results from SAS>
>>>
>>>    Log Likelihood                       -4.7514
>>>
>>>    Parameter   DF   Estimate      Error           Limits          Square   Pr > ChiSq
>>>
>>>    Intercept    1    -3.6652     1.9875    -7.5605     0.2302       3.40       0.0652
>>>    x            1     0.8926     2.4900    -3.9877     5.7728       0.13       0.7200
>>>    Scale        0     1.0000     0.0000     1.0000     1.0000
>>>
>>>
>>>
>>> 2) glm in R
>>>
>>> bin_data <- data.frame(y=c(5,5,4,4), r=c(1,0,0,1), k=c(3,2,2,2),
>>>                        x=c(0.5,0.5,1.0,1.0))
>>> glm(r/k ~ x, family=binomial(link='cloglog'), data=bin_data,
>>> offset=log(y))
>>>
>>>     <Results from R>
>>>    Coefficients:
>>>    (Intercept)            x
>>>        -3.991        1.358
>>>
>>>    'log Lik.' -0.9400073 (df=2)
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
> --
>  O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
> (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
>



-- 
Regards,

Ajay Ohri
http://tinyurl.com/liajayohri

