[R] Goodness of fit with robust regression

Spencer Graves spencer.graves at structuremonitoring.com
Mon Mar 14 16:54:41 CET 2011


       I'm not an expert on robust modeling.  However, as far as I know,
most robust regression procedures are based on heuristics, justified by
claims that "it seems to work" rather than by reference to assumptions
about a probability model that makes the procedures "optimal".  There
may be exceptions for procedures that assume a linear model plus noise
that follows a Student's t distribution or a contaminated normal.  Thus,
if you can't get a traditional R-squared from a standard robust regression
function, it may be because the people who wrote the function thought
that R-squared (as "percent of variance explained") did not make sense
in that context.  This is particularly true for robust general linear
models.


       Fortunately, the prospects are not as grim as this explanation
might seem:  The summary method for an "lmrob" object (from the
robustbase package) returned for me the standard table with estimates,
standard errors, t values, and p values for the regression
coefficients.  The robustbase package also includes an anova method for
two nested lmrob models.  This returns pseudoDF (a replacement for the
degrees of freedom), Test.Stat (analogous to 2*log(likelihood ratio)),
Df, and Pr(>chisq).  In addition to the 5 References in the lmrob help
page, help(package=robustbase) describes the package as ' "Essential"
Robust Statistics.  The goal is to provide tools allowing to analyze
data with robust methods.  This includes regression methodology
including model selections and multivariate statistics where we strive
to cover the book "Robust Statistics, Theory and Methods" by Maronna,
Martin and Yohai; Wiley 2006.'
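
       For illustration, here is a minimal sketch of the summary and
anova calls described above.  It is not the analysis from the original
question:  it assumes the salinity data set that ships with robustbase,
and the object names fit.full and fit.reduced are made up for this
example.

```r
library(robustbase)

# Example data shipped with robustbase (an assumption for this sketch;
# substitute your own data frame and formula).
data(salinity)

set.seed(1)  # lmrob uses random subsampling for its initial estimate

# Full model and a nested model that drops X2
fit.full    <- lmrob(Y ~ X1 + X2 + X3, data = salinity)
fit.reduced <- lmrob(Y ~ X1 + X3,      data = salinity)

# Coefficient table: estimates, standard errors, t values and p values
coef.table <- summary(fit.full)$coefficients
print(coef.table)

# Robust test comparing the nested models (list the larger model first)
cmp <- anova(fit.full, fit.reduced)
print(cmp)
```

The anova call only makes sense when both models are fitted to the same
data and one is nested in the other.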


       I chose to use lmrob, because it seemed the obvious choice from a 
search I did of Jonathan Baron's database of contributed R packages:


library(sos)
rls   <- findFn('robust fit')    # 477 matches;  retrieved 400
rls.m <- findFn('robust model')  # 2404 matches;  retrieved 400
rls.  <- rls | rls.m             # union of the two searches
installPackages(rls.)
# install missing packages with many matches
# so we can get more information about those packages
writeFindFn2xls(rls.)
# Produce an Excel file with a package summary
# as well as a table of the individual matches


       Hope this helps.
       Spencer Graves


p.s.  The functions in MASS are very good.  I did not use rlm in this 
case primarily because MASS was package number 27 in the package summary 
in the Excel file produced by the above script.  Beyond that, 
methods(class='rlm') identified predict, print, se.contrast, summary and 
vcov methods for rlm objects, and showMethods(class='rlm') returned 
nothing.  Conclusion:  If there is an anova method for rlm objects, I 
couldn't find it.
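
The check described in this p.s. can be reproduced with a short sketch
like the following.  It assumes the built-in stackloss data purely for
illustration, the name fit.rlm is made up, and the list of methods
printed will depend on the installed version of MASS.

```r
library(MASS)

# stackloss is a built-in R data set, used here only for illustration.
fit.rlm <- rlm(stack.loss ~ ., data = stackloss)

# S3 methods registered for class "rlm" (the list depends on the MASS version)
print(methods(class = "rlm"))

# The coefficient table holds values, standard errors and t values only
rlm.table <- summary(fit.rlm)$coefficients
print(colnames(rlm.table))
```

The absence of a p-value column in that table is what the original
poster ran into with summary(mod1.rlm).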


On 3/14/2011 7:00 AM, agent dunham wrote:
> I also have the same problem, can anybody help?
>
> and I would also like to see the p-values associated with the t-value of the
> coefficients.
>
> At present I type summary (mod1.rlm) and neither of these things appear.
>
> Thanks, user at host.com
>
> --
> View this message in context: http://r.789695.n4.nabble.com/R-Goodness-of-fit-with-robust-regression-tp809412p3353919.html
> Sent from the R help mailing list archive at Nabble.com.
>


