[R] Re : Problems with generalized linear model (glm) coefficients.

David Winsemius dwinsemius at comcast.net
Wed Mar 7 18:46:44 CET 2012

On Mar 7, 2012, at 12:13 PM, Lucas wrote:

> Thank you.
> What could be a "User Error"? Where could I be making a mistake?

Please consider what you are requesting. Attempting to enumerate the
possible errors would start with data input errors, progress through
transformational errors and finish up with analysis errors. It would
expand from that broad classification into something like the
accumulated sample of user errors recorded in the 15 years of R-help
archives as a lower bound, which all users know is not exhaustive, since
we have made additional errors in the privacy of our own bedrooms.
Without a reproducible sample of data and code there can be no
specificity. The discipline of constructing a reproducible question for
R-help has led to quite a few solutions to my own errors (which were
then not posted).
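
For what it is worth, here is a minimal sketch of what a reproducible
posting might look like. Everything in it is a made-up stand-in: the
data frame name fires_df, the columns temp, precip and fires, and all
the simulated values are assumptions, since the real data were never
posted. Only the 25 seasons come from the thread.

```r
# Hypothetical stand-in for the real data: 25 simulated fire seasons.
set.seed(42)
n <- 25
fires_df <- data.frame(
  temp   = rnorm(n, mean = 24, sd = 2),    # assumed seasonal mean temperature
  precip = rnorm(n, mean = 300, sd = 60)   # assumed seasonal precipitation
)

# Simulate overdispersed counts with an assumed positive temperature
# effect and an assumed negative precipitation effect on the log scale.
mu <- exp(1 + 0.15 * fires_df$temp - 0.004 * fires_df$precip)
fires_df$fires <- rnbinom(n, size = 2, mu = mu)

# The structure of the data and the exact model call are what a reader
# needs in order to see where a mistake might be.
str(fires_df)
fit <- glm(fires ~ temp + precip, family = quasipoisson, data = fires_df)
summary(fit)$coefficients

# dput() prints a text version of the data that can be pasted into a message.
dput(fires_df)
```

With the real data, pasting the dput() output and the exact model call
into the message would let others rerun the fit rather than guess at it.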

> I cannot use lm because my data are not normal; they are discrete
> (count data). So my first option was Poisson, but it had severe
> overdispersion problems, so I used the negative binomial instead.
> Thank you for taking the time to answer my questions.
> Lucas.
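
A rough sketch of how the overdispersion step could be checked, again
on simulated data with hypothetical names (temp, fires) and assumed
effect sizes. It estimates the dispersion from the Pearson residuals of
a Poisson fit and then compares the coefficients with quasi-Poisson and
negative binomial fits, using glm.nb() from the MASS package.

```r
library(MASS)  # provides glm.nb(); a recommended package shipped with R

# Simulated stand-in data (assumed values, not the poster's data).
set.seed(7)
n <- 25
temp  <- rnorm(n, mean = 24, sd = 2)
fires <- rnbinom(n, size = 2, mu = exp(-1 + 0.15 * temp))

# Poisson fit and a Pearson-based dispersion estimate;
# values well above 1 suggest overdispersion.
pois_fit   <- glm(fires ~ temp, family = poisson)
dispersion <- sum(residuals(pois_fit, type = "pearson")^2) /
  df.residual(pois_fit)
dispersion

# Two common alternatives for overdispersed counts.
quasi_fit <- glm(fires ~ temp, family = quasipoisson)
nb_fit    <- glm.nb(fires ~ temp)

# Compare point estimates across the three fits.
rbind(poisson      = coef(pois_fit),
      quasipoisson = coef(quasi_fit),
      negbin       = coef(nb_fit))
```

The quasi-Poisson fit has the same point estimates as the Poisson fit
and differs only in its standard errors, so that particular switch
cannot change a sign. A negative binomial fit can shift the estimates
somewhat, but a sign that persists across all three fits probably has
another cause.
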
> 2012/3/7 peter dalgaard <pdalgd at gmail.com>
>> On Mar 7, 2012, at 15:02 , Lucas wrote:
>>> Hi Pascal.
>>> I applied my analysis in time. I have 25 fire seasons; each season
>>> starts in November and ends in April (our summer)
>> Hey, why are you worrying about regression coefficients?
>> _Everything_ is upside-down at your place... ;-)
>>> , so I have used them as independent observations. I know that
>>> assumption could be wrong, but it is the only way I can use the
>>> information available.
>> As a general matter, there are three possibilities
>> 1) User error
>> 2) Method artifact
>> 3) Counterintuitive (but true) relation
>> and you really need to keep the possibility of 3) in mind rather than
>> poking around hoping that the counterintuitive signs would go away by
>> themselves.
>> To investigate, I think I would make some stabs that try to get
>> closer to the raw data. If you produce a plot showing that the
>> average number of fires is increasing with temperature and a model
>> fit with temperature as the only predictor apparently shows the
>> opposite, then I'd suspect a user error causing coefficients not to
>> mean what you think they mean.
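
The check described above might be sketched along these lines. The
data are simulated and the names (d, temp, fires, temp_bin) are
hypothetical; the point is only the comparison between a raw-data
summary and a single-predictor fit.

```r
# Simulated stand-in data with an assumed positive temperature effect.
set.seed(1)
n <- 25
d <- data.frame(temp = rnorm(n, mean = 24, sd = 2))
d$fires <- rpois(n, lambda = exp(-1 + 0.15 * d$temp))

# Raw-data view 1: mean number of fires within temperature terciles.
d$temp_bin <- cut(d$temp,
                  breaks = quantile(d$temp, probs = c(0, 1/3, 2/3, 1)),
                  include.lowest = TRUE)
bin_means <- tapply(d$fires, d$temp_bin, mean)
bin_means

# Raw-data view 2: a plain scatterplot of the counts against temperature.
plot(fires ~ temp, data = d,
     xlab = "Temperature", ylab = "Fires per season")

# Model view: temperature as the only predictor.
fit_temp <- glm(fires ~ temp, family = poisson, data = d)
coef(fit_temp)["temp"]
```

If the tercile means and the plot rise with temperature while the
coefficient from the temperature-only fit is negative, the first things
to check are how the variables were coded and which column ended up in
which role.
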
>>> Thank you.
>>> 2012/3/7 Pascal Oettli <kridox at ymail.com>
>>>> Hi Lucas,
>>>> Do you apply your analysis in time or in space?
>>>> Regards,
>>>> Pascal
>>>> ----- Original Message -----
>>>> From: Lucas <lpchaparrovio at gmail.com>
>>>> To: r-help at r-project.org
>>>> Cc:
>>>> Sent: Wednesday, 7 March 2012, 22:34
>>>> Subject: [R] Problems with generalized linear model (glm)
>>>> coefficients.
>>>> Hello to everyone.
>>>> I'm writing to you because I'm feeling a bit frustrated with my
>>>> work. My work consists of finding the relation between the number
>>>> of fires and the weather, so my response variable is the number of
>>>> fires in a fire season and the explanatory variables are the
>>>> temperature, the amount of precipitation and some others. My
>>>> problem is this: I keep getting the wrong sign in the estimated
>>>> coefficients. I get a negative sign for temperature and a positive
>>>> sign for precipitation, which is unreasonable: the greater the
>>>> temperature, the more fires I would expect; on the contrary, the
>>>> greater the precipitation, the fewer fires I would expect. So far
>>>> I have dealt with overdispersion, multicollinearity and the number
>>>> of zeroes by moving from Poisson to Negative Binomial and Hurdle
>>>> models. I believe I have used all my options and still have the
>>>> wrong signs on my coefficients.
>>>> Do I have more options? What does it mean that I keep getting those
>>>> signs?
>>>> If anyone could help me I would really appreciate it.

David Winsemius, MD
West Hartford, CT
