[R] glm: offset

(Ted Harding) Ted.Harding at manchester.ac.uk
Mon Mar 3 08:51:35 CET 2008


On 03-Mar-08 03:19:01, Wensui Liu wrote:
> Hi, John,
> my understanding is that you should use log(...) rather than the
> original scale. Below is the logic for Poisson regression:
> log(y / offset) = x'b
> => log(y) - log(offset) = x'b
> => log(y) = x'b + log(offset)

Well, this is where it gets interesting!
The above statement of the "logic" begs the question (i.e. assumes
the answer).

I would go according to the general interpretation of "offset"
in LM and GLM modelling -- an "offset" is

  "a quantitative variable whose regression coefficient
   is known to be 1"
  [McCullagh and Nelder (1983) "Generalized Linear Models",
    page 138]
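
As a minimal sketch of that definition in R (simulated data; the names
x, z, y and fit are made up for illustration and are not from John's
problem), offset() places a variable in the linear predictor with its
coefficient held at 1 instead of being estimated:

```r
# Simulated data; every name here is illustrative only.
set.seed(1)
n <- 200
x <- rnorm(n)
z <- runif(n, 0, 2)                          # variable to be used as offset
y <- rpois(n, exp(0.5 + 0.3 * x + 1 * z))    # true coefficient of z is 1

# offset(z) enters the linear predictor with coefficient fixed at 1
fit <- glm(y ~ x + offset(z), family = poisson(link = "log"))

coef(fit)   # only (Intercept) and x are estimated; z gets no coefficient

# The fitted linear predictor is the estimated part plus z, unscaled
eta <- coef(fit)[1] + coef(fit)[2] * x + z
all.equal(unname(predict(fit, type = "link")), unname(eta))
```

Whatever is passed to offset() is added to log(mu) as it stands, so the
choice of scale is a choice about the model, not about the software.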

Since the GLM for a Poisson regression with log link models

  L = log(mu) = a + b1*X1 + b2*X2 + ...

where mu is the Poisson mean and X1, X2, ... are the raw explanatory
variables (untransformed, unless you have other reasons for
transforming them before bringing them into the regression), if X1
is the variable you wish to use as "offset" in the above sense then
it should be used untransformed, with b1 fixed at 1.
On this basis, the answer to John Sorkin's question should be:
don't use log(NumUniqPt), use NumUniqPt.
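
To make the two candidate calls concrete, here is a minimal sketch on
made-up data (the NumUniqPt, x and y below are simulated, and the
counts are deliberately not generated from either model). It only
shows which linear predictor each call fits; which one is wanted
depends on which quantity is meant to have coefficient 1 on the
log(mu) scale.

```r
# Illustrative data only; nothing here comes from John's data set.
set.seed(2)
n <- 100
x <- rnorm(n)
NumUniqPt <- sample(1:5, n, replace = TRUE)
y <- rpois(n, exp(0.2 + 0.4 * x + 0.3 * NumUniqPt))   # arbitrary counts

fit_raw <- glm(y ~ x, offset = NumUniqPt,
               family = poisson(link = "log"))
fit_log <- glm(y ~ x, offset = log(NumUniqPt),
               family = poisson(link = "log"))

# fit_raw fits  log(mu) = a + b*x + NumUniqPt
# fit_log fits  log(mu) = a + b*x + log(NumUniqPt),
#               i.e.  mu = NumUniqPt * exp(a + b*x)
X <- cbind(1, x)
eta_raw <- drop(X %*% coef(fit_raw)) + NumUniqPt
eta_log <- drop(X %*% coef(fit_log)) + log(NumUniqPt)

all.equal(eta_raw, unname(predict(fit_raw, type = "link")))
all.equal(eta_log, unname(predict(fit_log, type = "link")))
```

In both fits the offset is added to the linear predictor exactly as
supplied; glm() does not take the log for you.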

There's a potential confusion here in that presumably
"NumUniqPt" is a positive variable whose distribution
in the data may be skewed, i.e. the sort of variable that
you may feel urged to take the log of before using it.

But that would be an "other reason" in the sense of my
comment above.

After all, suppose "NumUniqPt" denoted a variable in the
data that could take negative values. Would you be happy
to use log(NumUniqPt) in that case?

Best wishes to all,
Ted.


> On Sun, Mar 2, 2008 at 10:01 PM, John Sorkin
> <jsorkin at grecc.umaryland.edu> wrote:
>> R 2.6.0
>>  Windows XP
>>
>>  A question about running a generalized linear model.
>>
>>  I am running a glm with
>>  (1) a poisson distribution and a log link:
>>    family=poisson(link = "log")
>>  and an offset.
>>  I would like to know if I should express the offset as the log of the
>>  offset value, i.e.
>>  offset=log(NumUniqPt)
>>  or as:
>>  offset=NumUniqPt
>>
>>  I suspect I need to use the log, but I can't find any discussion of
>>  this in MASS 1994 or on the man page for glm.
>>  Thanks
>>  John
>>
>>
>>  John Sorkin M.D., Ph.D.
>>  Chief, Biostatistics and Informatics
>>  University of Maryland School of Medicine Division of Gerontology
>>  Baltimore VA Medical Center
>>  10 North Greene Street
>>  GRECC (BT/18/GR)
>>  Baltimore, MD 21201-1524
>>  (Phone) 410-605-7119
>>  (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>>
>>
>>  ______________________________________________
>>  R-help at r-project.org mailing list
>>  https://stat.ethz.ch/mailman/listinfo/r-help
>>  PLEASE do read the posting guide
>>  http://www.R-project.org/posting-guide.html
>>  and provide commented, minimal, self-contained, reproducible code.
>>
> 
> 
> 
> -- 
> ===============================
> WenSui Liu
> ChoicePoint Precision Marketing
> Phone: 678-893-9457
> Email : wensui.liu at choicepoint.com
> Blog   : statcompute.spaces.live.com
> 

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 03-Mar-08                                       Time: 07:51:32
------------------------------ XFMail ------------------------------


