[R] robust mlm in R?

Michael Friendly friendly at yorku.ca
Mon Jun 2 15:51:15 CEST 2008


I had no on-list replies, so I cobbled up a function for the simplest
method I could think of -- iterative multivariate trimming, following
Gnanadesikan, Kettering & Wilks, assigning 0 weights to observations
based on the Mahalanobis D^2 of residuals.

But I'm getting an error I don't understand, and neither traceback()
nor browser() gives me any insight.  Can anyone tell me what is wrong
with the lm() call in rmlm.GKW below?

 > iris.rmod <- rmlm.GKW(cbind(Sepal.Length, Sepal.Width, Petal.Length, 
Petal.Width)~Species, data=iris)
Error in model.frame.default(formula = formula, data = data, subset = 
subset,  :
   invalid type (closure) for variable '(weights)'
 >

Here are the functions:

# Mahalanobis Dsq for a matrix of variables
dsq <- function(x, Sigma) {
   if (missing(Sigma)) Sigma <- cov(x, use="complete.obs")
   dev <- scale(x, scale=FALSE)
# DSQ <- dev %*% solve(Sigma) %*% t(dev )
   DSQ <- apply(dev * (dev %*% solve(Sigma)), 1, sum)
   return(DSQ)
}

# robust mlm via multivariate trimming a la Gnanadesikan, Kettering & Wilks
rmlm.GKW <- function(formula, data, subset, weights=NULL, iter=3, 
pvalue=.01) {
   if (missing(weights) | is.null(weights)) { weights <- rep(1, 
nrow(data)) }
   last.weights <- weights
   for (i in 1:iter) {
     mod <- lm(formula=formula, data=data, subset=subset, weights=weights)
     res <- residuals(mod)
     coef <- mod$coefficients
     print (coef)
     p <- ncol(res)
     DSQ <- dsq(res)
     prob <- pchisq(DSQ, p, lower.tail=FALSE)
     weights <- ifelse( prob<pvalue, 0, weights)
     nzero <- which(weights=0)
     print (nzero)
     if (all.equal(weights, last.weights)) { break }
   }
}

Michael Friendly wrote:
> I'm looking for something in R to fit a multivariate linear model 
> robustly, using
> an M-estimator or any of the myriad of other robust methods for linear 
> models
> implemented in robustbase or methods based on MCD or MVE covariance
> estimation (package rrcov).
> 
> E.g., one can fit an mlm for the iris data as:
> iris.mod <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, 
> Petal.Width) ~ Species, data=iris)
> 
> What I'd like is something like rlm() in MASS, but handling an mlm, e.g.,
> iris.mod <- rmlm(cbind(Sepal.Length, Sepal.Width, Petal.Length, 
> Petal.Width) ~ Species, data=iris)
> and returning a vector of observation weights in its result.
> 
> There's a burgeoning literature on this topic, but I haven't yet found 
> computational methods.
> Any pointers or suggestions would be appreciated.
> 
> -Michael
> 
> 


-- 
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA



More information about the R-help mailing list