[R] Highlighting points in a scatter plot matrix

Mulholland, Tom Tom.Mulholland at dpi.wa.gov.au
Tue Mar 22 09:20:48 CET 2005


There are two issues here: identifying the outliers and highlighting them.

I have only a basic grasp of both, but will give what I have in case it helps. Over the last two decades there appears to have been a move to refine what actually constitutes an outlier. Brian Ripley commented on this in 2003 when he said "That's the whole point of robust methods: compensate rather than reject." So I would suggest finding a copy of an article cited by Brian last year: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/35340.html

As Uwe has pointed out, if you are using pairs then you will have to write your own panel function, unless someone has already written something suitable. I have avoided using the panel function as it seems a bit cumbersome compared with building the matrix yourself from ordinary plots.
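As a rough sketch of what such a panel function might look like: pairs() calls the panel function once per off-diagonal panel with that panel's x and y, so outliers can be judged separately for each pair of variables. The names flag_outliers and panel_outliers are made up for illustration, and the 95% cutoff on the Mahalanobis distance is only an assumed rule, not a recommendation.

```r
# Hypothetical helper: flag points whose squared Mahalanobis distance
# exceeds the chosen quantile within a single pair of variables.
flag_outliers <- function(x, y, prob = 0.95) {
  xy <- cbind(x, y)
  d2 <- mahalanobis(xy, colMeans(xy), cov(xy))
  d2 > quantile(d2, prob)
}

# Hypothetical panel function for pairs(): receives the x and y vectors
# of one panel and colours the flagged points red.
panel_outliers <- function(x, y, ...) {
  out <- flag_outliers(x, y)
  points(x, y, pch = 20, col = ifelse(out, "red", "navy"))
}

# Stand-in data: the four measurements for the setosa rows of iris
setosa <- iris[1:50, 1:4]
pairs(setosa, panel = panel_outliers)
```

Because the flagging happens inside the panel function, a point is only coloured red in the panels where it is extreme for that particular pair of variables.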

I haven't used the lattice package for a while now, but it is obvious that major improvements have been made recently, and you may find it a better vehicle for plotting your data.
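Something along these lines may work with lattice's splom(), which also accepts a panel function. This is only a sketch: it assumes the lattice package is installed, panel_outliers is a made-up name, and the 95% Mahalanobis cutoff is again just an illustrative rule.

```r
library(lattice)

# Stand-in data: the four measurements for the setosa rows of iris
setosa <- iris[1:50, 1:4]

# Hypothetical panel function: splom() calls it once per off-diagonal panel
panel_outliers <- function(x, y, ...) {
  xy <- cbind(x, y)
  d2 <- mahalanobis(xy, colMeans(xy), cov(xy))
  out <- d2 > quantile(d2, 0.95)
  panel.xyplot(x, y, pch = 20, col = ifelse(out, "red", "navy"))
}

p <- splom(~ setosa, panel = panel_outliers)
print(p)  # lattice objects are drawn when printed
```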

However, for a single plot there's no real problem:
plot(x,y,pch = 20, col = "navy")
points(x[outlier],y[outlier],pch = 20, col = "red")

where "outlier" indexes the observations you consider to be outliers.
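For instance, "outlier" could be a logical vector built from a simple threshold. The sketch below uses the setosa rows of iris as stand-in data, and the cutoff of 2.5 is arbitrary, chosen only to illustrate the indexing.

```r
# Stand-in data: sepal measurements for the setosa rows of iris
x <- iris$Sepal.Length[1:50]
y <- iris$Sepal.Width[1:50]

# Arbitrary illustrative rule: treat unusually narrow sepals as outliers
outlier <- y < 2.5

plot(x, y, pch = 20, col = "navy")
points(x[outlier], y[outlier], pch = 20, col = "red")
```

An integer index such as the result of which() works the same way in the points() call.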

A crude example of what can be done, rather than what should be done, follows (the data are not really appropriate for this):

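# Just select setosa; work on a copy so the built-in iris data is left intact
setosa <- iris[1:50,]

par(mfrow = c(4,4))
for (j in 1:4){
  for (k in 1:4){
    if (j == k){
      # Leave the diagonal panels empty
      plot(5,axes = FALSE,type = "n",xlab = "",ylab = "")
    } else {
      # Squared Mahalanobis distance of each point within this pair of variables;
      # the centre is the vector of column means
      mah <- mahalanobis(setosa[,c(j,k)],colMeans(setosa[,c(j,k)]),cov(setosa[,c(j,k)]))
      # Flag the points beyond the 95th percentile of those distances
      outlier <- which(mah > quantile(mah,.95))

      plot(setosa[,j],setosa[,k],pch = 20, col = "navy",axes = FALSE,xlab = names(setosa)[j],ylab = names(setosa)[k])
      points(setosa[outlier,j],setosa[outlier,k],pch = 20, col = "red")
    }
  }
}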
    



> -----Original Message-----
> From: Brett Stansfield [mailto:brett at hbrc.govt.nz]
> Sent: Tuesday, 22 March 2005 6:09 AM
> To: R help (E-mail)
> Subject: [R] Highlighting points in a scatter plot matrix
> 
> 
> Dear R
> I recently did a scatterplot matrix using the following command
> pairs(sleep[c("SlowSleep", "ParaSleep", "logbw", "logbrw", "loglife",
> "loggest")],col=1+as.integer(ParaSleep > 5.5 | SlowSleep > 15.7))
> this highlighted outlying points for some of the x,y plots that I needed to
> identify. Unfortunately this highlights all the x,y plots some for which
> these points are not necessarily outliers. Is there a way to specify
> highlighting selected points at selected x,y plots within a matrix?
> 



