[R] Graphical presentation of logistic regression

Thu Sep 15 15:05:01 CEST 2005

Jari Oksanen wrote:
> On Wed, 2005-09-14 at 06:29 -0500, Frank E Harrell Jr wrote:
> 
>>Beale, Colin wrote:
>>
>>>Hi,
>>>
>>>I wonder if anyone has written any code to implement the suggestions of
>>>Smart et al (2004) in the Bulletin of the Ecological Society of America
>>>for a new way of graphically presenting the results of logistic
>>>regression (see
>>>www.esapubs.org/bulletin/backissues/085-3/bulletinjuly2004_2column.htm#tools1
>>>for the full text)? I couldn't find anything relating to this sort
>>>of graphical representation of logistic models in the archives, but
>>>maybe someone has solved it already? In short, Smart et al suggest that
>>>a logistic regression be presented as a combination of the two
>>>histograms for successes and failures (with one presented upside down at
>>>the top of the figure, the other the right way up at the bottom)
>>>overlaid by the probability function (i.e. the logistic curve). It's somewhat
>>>hard to describe, but is nicely illustrated in the full text version
>>>above. I think it is a sensible way of presenting these results and am
>>>keen to do so - at the moment I can only do this by generating the two
>>>histograms and the logistic curve separately (using hist() and lines()),
>>>then copying and pasting the graphs out of R and inverting one in a
>>>graphics package, before overlaying the others. I'm sure this could be
>>>done within R and would be a handy plotting function to develop. Has
>>>anyone done so, or can anyone give me any pointers to doing this? I
>>>really need to know how to invert a histogram and how to overlay this
>>>with another histogram "the right way up".
>>>
>>>Any thoughts would be welcome.
>>>
>>>Thanks in advance,
>>>Colin
>>
>> From what you describe, that is a poor way to represent the model 
>>except for judging discrimination ability (if the model is calibrated 
>>well).  Effect plots, odds ratio charts, and nomograms are better.  See 
>>the Design package for details.
>>
> 
> 
> You're correct when you say that this is a poor way to represent the
> model. However, you should have some understanding for us ecologists, who
> are simple creatures working with tangible subjects such as animals and
> plants (microbiologists work with less tangible things). Therefore we
> want to have a concrete and simple representation. After all, the
> example was about occurrence of an animal against a concrete
> environmental variable, and a concrete representation was suggested.
> Nomograms and things are abstractions that you understand only after
> long education and training (I tried the Design package and I didn't
> understand the nomogram plot). 

I don't understand why you think the histograms are "representing the 
model".  That approach even seems to be interchanging the roles of the 
independent and dependent variables.
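
For readers who have not seen the figure, the display under discussion can be sketched in base R roughly as follows. This is only a minimal illustration on simulated data, not the code from Smart et al. or anyone else in this thread; the variable names, the bin count, and the 40% bar scaling are all assumptions made for the example.

```r
# Simulated data (an assumption for illustration, not data from the thread):
# x is a hypothetical environmental variable, y is presence (1) / absence (0).
set.seed(1)
x <- runif(200, 0, 10)
y <- rbinom(200, 1, plogis(-3 + 0.6 * x))

fit <- glm(y ~ x, family = binomial)

# Common breaks so that both histograms share the same bins
breaks <- pretty(x, n = 15)
h0 <- hist(x[y == 0], breaks = breaks, plot = FALSE)
h1 <- hist(x[y == 1], breaks = breaks, plot = FALSE)

# Scale counts so the tallest bar spans 40% of the 0-1 probability axis
scale <- 0.4 / max(h0$counts, h1$counts)
left  <- breaks[-length(breaks)]
right <- breaks[-1]

plot(range(breaks), c(0, 1), type = "n",
     xlab = "x", ylab = "Probability of presence")
# Absences: bars rising from the bottom of the panel
rect(left, 0, right, h0$counts * scale, col = "grey80")
# Presences: bars hanging down from the top (the "inverted" histogram)
rect(left, 1, right, 1 - h1$counts * scale, col = "grey50")

# Fitted logistic curve drawn over both histograms
xs <- seq(min(breaks), max(breaks), length.out = 200)
lines(xs, predict(fit, data.frame(x = xs), type = "response"), lwd = 2)
```

Note that the bars show the distribution of x within each outcome group, while the curve shows the outcome probability given x, so the two layers answer different questions.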

> 
> I tried with one concrete example with my own data, and the inverted
> histogram method was patently misleading (with Baz Rowlingson's neat and
> compact code, sorry for the repetition). The method would be useful with
> dense and regular data only, but now the clearest visual cue was the
> uneven sampling intensity. With my limited knowledge of R facilities, I
> can now remember only two ways to preserve the concreteness of display
> in base R: jitter() to avoid overplotting of observations, and
> sunflowerplot() to show the amount of overplotting.
> 
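
The two base R options mentioned above might be used along these lines. This is a hedged sketch on simulated, deliberately unevenly sampled data; the data, the jitter amount, and the side-by-side layout are assumptions for illustration only.

```r
# Simulated data (an assumption for illustration): sampling is dense at low x
# and sparse at high x, and x is rounded so that many observations are tied.
set.seed(2)
x <- pmin(round(c(rexp(150, rate = 0.5), runif(30, 0, 10))), 10)
y <- rbinom(length(x), 1, plogis(-2 + 0.5 * x))

fit <- glm(y ~ x, family = binomial)
xs  <- seq(0, 10, length.out = 100)
p   <- predict(fit, data.frame(x = xs), type = "response")

op <- par(mfrow = c(1, 2))

# jitter(): add small vertical noise so tied observations do not overplot
plot(x, jitter(y, amount = 0.05), ylim = c(-0.1, 1.1),
     xlab = "x", ylab = "y (jittered)")
lines(xs, p, lwd = 2)

# sunflowerplot(): draw one "petal" per tied observation at each (x, y) point
sunflowerplot(x, y, xlab = "x", ylab = "y")
lines(xs, p, lwd = 2)

par(op)
```

Both panels keep the individual 0/1 observations visible, so uneven sampling intensity shows up as crowding of points or petals rather than as bar height.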
> I think the Ecological Society of America would be happy to receive papers
> suggesting better ways to represent binary response data, if some of the
> knowledgeable persons in this group decided to educate them (I'm
> not an ESA member, so I wouldn't be educated: therefore 'them' instead
> of 'us'). The ESA bulletin will be influential in manuscripts submitted
> to the Society journals in the future, and the time for action is now.

See

@Article{gui00ord,
   author  = {Guisan, Antoine and Harrell, Frank E.},
   title   = {Ordinal response regression models in ecology},
   journal = {Journal of Vegetation Science},
   year    = 2000,
   volume  = 11,
   pages   = {617-626},
   annote  = {teaching;ordinal logistic model}
}

This is more complex than needed (ordinal instead of binary) but binary 
is a special case of ordinal.
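
To spell out the special case, here is a short sketch assuming the proportional odds form of the ordinal model, which is one common choice. The notation (levels 0, ..., k, intercepts alpha_j, common slope vector beta) is an assumption of this illustration and is not taken from the paper.

```latex
% Proportional odds model for an ordinal response Y with levels 0, 1, ..., k:
\Pr(Y \ge j \mid X) = \frac{1}{1 + \exp\{-(\alpha_j + X\beta)\}},
\qquad j = 1, \dots, k .

% With only two levels (k = 1) there is a single intercept, and the model
% reduces to the ordinary binary logistic model:
\Pr(Y = 1 \mid X) = \frac{1}{1 + \exp\{-(\alpha_1 + X\beta)\}} .
```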

Cheers,

Frank

> 
> cheers, jari oksanen

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University