[Rd] plot.lm: "Cook's distance" label can overplot point labels

John Maindonald John.Maindonald at anu.edu.au
Thu Feb 19 08:53:39 CET 2009


Actually, the contours and the smooth are currently drawn with
col=2, which prints satisfactorily in grayscale.  Colours ("orange"
and "darkred", as well as col=2) are also used in termplot.

Does the stricture against "colour" extend to grayscale?  Does it  
apply to lines as well as text?
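
For anyone wanting to check how these colours separate once colour is
removed, here is a minimal sketch.  It assumes Rec. 601 luma weights as
the gray conversion, which is only an approximation (printers and
devices may convert differently), and it uses "red" to stand for col=2,
that being entry 2 of the default palette in current R.  The helper
name gray_level is made up for illustration.

```r
# Approximate gray level of a colour using Rec. 601 luma weights.
# This weighting is an assumption; real devices may convert differently.
gray_level <- function(col) {
  rgb <- col2rgb(col)[, 1] / 255       # red, green, blue scaled to [0, 1]
  sum(c(0.299, 0.587, 0.114) * rgb)    # 0 = black, 1 = white
}

# Colours mentioned above, plus black for reference
cols <- c("red", "orange", "darkred", "black")
print(round(sapply(cols, gray_level), 2))
```

Values nearer 0 print darker, so this gives a rough idea of how well
each line would stand out from black points and text on a
black-and-white device.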

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 19/02/2009, at 5:58 PM, Prof Brian Ripley wrote:

> On Wed, 18 Feb 2009, John Fox wrote:
>
>> Dear John,
>>
>>> -----Original Message-----
>>> From: John Maindonald [mailto:John.Maindonald at anu.edu.au]
>>> Sent: February-18-09 4:57 PM
>>> To: John Fox
>>> Cc: 'Martin Maechler'; r-devel at r-project.org
>>> Subject: Re: [Rd] plot.lm: "Cook's distance" label can overplot point labels
>>>
>>> Dear John -
>>> The title above the graph is also redundant for the first of the
>>> plots; do we want to be totally consistent?  I am not sure.
>>
>> Why not? "A foolish consistency is the hobgoblin of little minds," but
>> maybe this isn't a foolish consistency.
>>
>>>
>>> It occurs to me that the text "Cook's distance", as well as the
>>> contours, might be in red.
>>
>> That would provide a nice visual cue (for those who aren't colour blind).
>
> Or using a black-and-white device.  We have not hitherto assumed a  
> colour device in 'stats' graphics, and given how often they are  
> printed I don't think we want to start.
>
> As so often, it seems that what looks good is in the eye of the  
> beholder.  If the two of you can agree on something that you both  
> see is a definite improvement, please provide a patch and examples  
> to try to persuade everyone else.  (As a Wishlist item on R-bugs, so  
> it gets recorded.)
>
>>
>> Best,
>> John
>>
>>> Regards
>>> John.
>>>
>>> John Maindonald             email: john.maindonald at anu.edu.au
>>> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
>>> Centre for Mathematics & Its Applications, Room 1194,
>>> John Dedman Mathematical Sciences Building (Building 27)
>>> Australian National University, Canberra ACT 0200.
>>>
>>>
>>> On 18/02/2009, at 12:27 PM, John Fox wrote:
>>>
>>>> Dear John,
>>>>
>>>> It occurs to me that the title above the graph, "Residuals vs. Leverage,"
>>>> is entirely redundant since the x-axis is labelled "Leverage" and the
>>>> y-axis "Studentized residuals." Why not use the title above the graph
>>>> for "Cook's distance contours"?
>>>>
>>>> Regards,
>>>> John
>>>>
>>>>> -----Original Message-----
>>>>> From: r-devel-bounces at r-project.org
>>>>> [mailto:r-devel-bounces at r-project.org] On Behalf Of John Maindonald
>>>>> Sent: February-17-09 5:54 PM
>>>>> To: r-devel at r-project.org
>>>>> Cc: Martin Maechler
>>>>> Subject: [Rd] plot.lm: "Cook's distance" label can overplot point labels
>>>>>
>>>>> The following code demonstrates an annoyance with plot.lm():
>>>>>
>>>>> library(DAAGxtras)
>>>>> x11(width=3.75, height=4)
>>>>> nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data = nihills)
>>>>> plot(nihills.lm, which=5)
>>>>>
>>>>> OR try the following
>>>>> xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
>>>>> plot(lm(y ~ x, data=xy), which=5)
>>>>>
>>>>> The "Cook's distance" text overplots the label for the point with the
>>>>> smallest residual.  This is an issue when the size of the plot is much
>>>>> less than the default, and the pointsize is not reduced proportionately.
>>>>>
>>>>>
>>>>> I suggest the following:
>>>>>    xx <- hii
>>>>>    xx[xx >= 1] <- NA
>>>>> ## Insert new code
>>>>>    fracht <- (1.25*par()$cin[2])/par()$pin[2]
>>>>>    ylim[1] <- ylim[1] - diff(ylim)*max(0, fracht-0.04)
>>>>> ## End insert new code
>>>>>    plot(xx, rsp, xlim = c(0, max(xx, na.rm = TRUE)),
>>>>>         ylim = ylim, main = main, xlab = "Leverage",
>>>>>         ylab = ylab5, type = "n", ...)
>>>>>
>>>>> Then, about 15 lines further down, replace
>>>>>      legend("bottomleft", legend = "Cook's distance",
>>>>>             lty = 2, col = 2, bty = "n")
>>>>>
>>>>> by
>>>>>      legend("bottomleft", legend = "Cook's distance",
>>>>>             lty = 2, col = 2, bty = "n", y.intersp=0.5)
>>>>>
>>>>> If this second change is not made, then one wants
>>>>>    fracht <- (1.5*par()$cin[2])/par()$pin[2]
>>>>> I prefer the "Cook's distance" text to be a bit closer to the x-axis,
>>>>> as it separates it more clearly from any point labels.
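
To see what the proposed adjustment does outside plot.lm, the
arithmetic can be pulled into a small standalone function.  This is
only a sketch: expand_ylim and its argument names are hypothetical,
with cin_h and pin_h standing in for par()$cin[2] and par()$pin[2].
The 0.04 presumably reflects the 4% by which the default axis style
already pads the range.

```r
# Hypothetical helper (not part of plot.lm): extend the lower y-limit so
# that one line of legend text fits below the plotted points.
#   cin_h: character height in inches, e.g. par("cin")[2]
#   pin_h: plot region height in inches, e.g. par("pin")[2]
#   mult:  1.25 with y.intersp = 0.5 in legend(), 1.5 without it
expand_ylim <- function(ylim, cin_h, pin_h, mult = 1.25) {
  fracht <- (mult * cin_h) / pin_h   # text height as a fraction of plot height
  ylim[1] <- ylim[1] - diff(ylim) * max(0, fracht - 0.04)
  ylim
}

# Small plot: the text needs 10% of the height, so the range grows downwards.
small <- expand_ylim(c(-2, 2), cin_h = 0.2, pin_h = 2.5)
# Large plot: the text needs under 4% of the height, so ylim is unchanged.
large <- expand_ylim(c(-2, 2), cin_h = 0.2, pin_h = 10)
print(small)
print(large)
```

The max(0, ...) term means the limits are touched only when the plot
region is small relative to the text, which is the case that triggers
the overplotting.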
>
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595


