[Rd] plot.lm: "Cook's distance" label can overplot point labels

John Fox jfox at mcmaster.ca
Thu Feb 19 13:57:06 CET 2009


Dear John and Brian,

My point about colour-blindness was partly tongue-in-cheek, but I think that
it's a bad choice to have the second and third colours in the default
palette as red and green.
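
For reference, the colours in question can be inspected directly. This is a
minimal sketch using only base grDevices; the exact colour values are an
assumption that depends on the R version, so no output is shown.

```r
# Reset to the built-in default palette, then look at entries 2 and 3,
# which are what col = 2 and col = 3 resolve to in base graphics.
palette("default")
default_cols <- palette()[2:3]
default_cols

# Convert to RGB components (rows: red, green, blue; one column per colour)
# to see which channel dominates in each.
default_rgb <- col2rgb(default_cols)
default_rgb
```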

Regards,
 John


> -----Original Message-----
> From: John Maindonald [mailto:John.Maindonald at anu.edu.au]
> Sent: February-19-09 2:54 AM
> To: Prof Brian Ripley
> Cc: John Fox; r-devel at r-project.org; 'Martin Maechler'
> Subject: Re: [Rd] plot.lm: "Cook's distance" label can overplot point
> labels
> 
> Actually, the contours and the smooth are currently printed with
> col=2.  This prints satisfactorily in grayscale.  Colours ("orange"
> and "darkred" as well as col=2) are also used in termplot.
> 
> Does the stricture against "colour" extend to grayscale?  Does it
> apply to lines as well as text?
> 
> John Maindonald             email: john.maindonald at anu.edu.au
> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
> Centre for Mathematics & Its Applications, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
> 
> 
> On 19/02/2009, at 5:58 PM, Prof Brian Ripley wrote:
> 
> > On Wed, 18 Feb 2009, John Fox wrote:
> >
> >> Dear John,
> >>
> >>> -----Original Message-----
> >>> From: John Maindonald [mailto:John.Maindonald at anu.edu.au]
> >>> Sent: February-18-09 4:57 PM
> >>> To: John Fox
> >>> Cc: 'Martin Maechler'; r-devel at r-project.org
> >>> Subject: Re: [Rd] plot.lm: "Cook's distance" label can overplot
> >>> point labels
> >>>
> >>> Dear John -
> >>> The title above the graph is also redundant for the first of the
> >>> plots; do we want to be totally consistent?  I am not sure.
> >>
> >> Why not? "A foolish consistency is the hobgoblin of little minds,"
> >> but maybe this isn't a foolish consistency.
> >>
> >>>
> >>> It occurs to me that the text "Cook's distance", as well as the
> >>> contours, might be in red.
> >>
> >> That would provide a nice visual cue (for those who aren't colour
> >> blind).
> >
> > Or using a black-and-white device.  We have not hitherto assumed a
> > colour device in 'stats' graphics, and given how often they are
> > printed I don't think we want to start.
> >
> > As so often, it seems that what looks good is in the eye of the
> > beholder.  If the two of you can agree on something that you both
> > see is a definite improvement, please provide a patch and examples
> > to try to persuade everyone else.  (As a Wishlist item on R-bugs, so
> > it gets recorded.)
> >
> >>
> >> Best,
> >> John
> >>
> >>> Regards
> >>> John.
> >>>
> >>> John Maindonald             email: john.maindonald at anu.edu.au
> >>> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
> >>> Centre for Mathematics & Its Applications, Room 1194,
> >>> John Dedman Mathematical Sciences Building (Building 27)
> >>> Australian National University, Canberra ACT 0200.
> >>>
> >>>
> >>> On 18/02/2009, at 12:27 PM, John Fox wrote:
> >>>
> >>>> Dear John,
> >>>>
> >>>> It occurs to me that the title above the graph, "Residuals vs.
> >>>> Leverage," is entirely redundant since the x-axis is labelled
> >>>> "Leverage" and the y-axis "Studentized residuals." Why not use the
> >>>> title above the graph for "Cook's distance contours"?
> >>>>
> >>>> Regards,
> >>>> John
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: r-devel-bounces at r-project.org
> >>>>> [mailto:r-devel-bounces at r-project.org] On Behalf Of John Maindonald
> >>>>> Sent: February-17-09 5:54 PM
> >>>>> To: r-devel at r-project.org
> >>>>> Cc: Martin Maechler
> >>>>> Subject: [Rd] plot.lm: "Cook's distance" label can overplot point
> >>>>> labels
> >>>>>
> >>>>> The following code demonstrates an annoyance with plot.lm():
> >>>>>
> >>>>> library(DAAGxtras)
> >>>>> x11(width=3.75, height=4)
> >>>>> nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data =
> >>>>> nihills)
> >>>>> plot(nihills.lm, which=5)
> >>>>>
> >>>>> OR try the following
> >>>>> xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
> >>>>> plot(lm(y ~ x, data=xy), which=5)
> >>>>>
> >>>>> The "Cook's distance" text overplots the label for the point with the
> >>>>> smallest residual.  This is an issue when the size of the plot is much
> >>>>> less than the default, and the pointsize is not reduced
> >>>>> proportionately.
> >>>>>
> >>>>>
> >>>>> I suggest the following:
> >>>>>    xx <- hii
> >>>>>    xx[xx >= 1] <- NA
> >>>>> ## Insert new code
> >>>>>    fracht <- (1.25*par()$cin[2])/par()$pin[2]
> >>>>>    ylim[1] <- ylim[1] - diff(ylim)*max(0, fracht-0.04)
> >>>>> ## End insert new code
> >>>>>    plot(xx, rsp, xlim = c(0, max(xx, na.rm = TRUE)),
> >>>>>         ylim = ylim, main = main, xlab = "Leverage",
> >>>>>         ylab = ylab5, type = "n", ...)
> >>>>>
> >>>>> Then, about 15 lines further down, replace
> >>>>>      legend("bottomleft", legend = "Cook's distance",
> >>>>>             lty = 2, col = 2, bty = "n")
> >>>>>
> >>>>> by
> >>>>>      legend("bottomleft", legend = "Cook's distance",
> >>>>>             lty = 2, col = 2, bty = "n", y.intersp=0.5)
> >>>>>
> >>>>> If this second change is not made, then one wants
> >>>>>    fracht <- (1.5*par()$cin[2])/par()$pin[2]
> >>>>> I prefer the "Cook's distance" text to be a bit closer to the x-axis,
> >>>>> as it separates it more clearly from any point labels.
> >
> >
> > --
> > Brian D. Ripley,                  ripley at stats.ox.ac.uk
> > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford,             Tel:  +44 1865 272861 (self)
> > 1 South Parks Road,                     +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK                Fax:  +44 1865 272595

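The ylim adjustment proposed in the original message can be tried outside
plot.lm() as a standalone function. This is only a sketch: extend_ylim() and
its argument names are hypothetical and not part of 'stats', and reading the
0.04 as the usual 4% axis padding is an assumption, since the original
message does not say so.

```r
# Hypothetical helper (not part of stats::plot.lm): lower ylim[1] so that
# roughly `n_lines` lines of text fit below the lowest plotted point.
extend_ylim <- function(ylim, char_height_in, plot_height_in,
                        n_lines = 1.25, margin = 0.04) {
  # Fraction of the plot-region height occupied by n_lines lines of text
  fracht <- (n_lines * char_height_in) / plot_height_in
  # Extend only by the part exceeding `margin` (assumed to stand for the
  # default 4% padding that the axis range already receives)
  ylim[1] <- ylim[1] - diff(ylim) * max(0, fracht - margin)
  ylim
}

# In a live session the two sizes would come from the open device:
#   extend_ylim(ylim, par("cin")[2], par("pin")[2])
# Fixed, made-up sizes are used here so the arithmetic is reproducible.
small_plot <- extend_ylim(c(-2, 2), char_height_in = 0.2, plot_height_in = 2.5)
small_plot
```

With these illustrative sizes the text takes up 0.1 of the plot height, so
the lower limit moves from -2 to -2.24. When the text takes up less than
4% of the plot height, the limits come back unchanged.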