[R] Collinearity in Linear Multiple Regression

Stephan Kolassa Stephan.Kolassa at gmx.de
Wed Jul 22 20:06:18 CEST 2009


Hi Tim,

Variance proportions (and condition indices) are exactly the tools
described in Belsley, Kuh & Welsch, "Regression Diagnostics" - see my 
previous post. Good to see I'm not the only one to use them! BKW also 
describe in detail how to calculate all this using SVD, so you don't 
need to use SAS...
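
A minimal sketch of that calculation, for anyone who wants to see the mechanics. It is written in plain Python so it runs without any packages; in R, svd() would do the decomposition. The function names (scale_columns, jacobi_svd, collinearity_diagnostics) and the toy data are made up for illustration and do not come from BKW or from any package. The sketch assumes the BKW-style setup: keep the intercept column, scale every column to unit length without centering, and assume no singular value is exactly zero. The SVD itself uses a simple one-sided Jacobi iteration, chosen only because it is short.

```python
import math

def scale_columns(X):
    """Scale each column of X to unit Euclidean length (no centering)."""
    n, p = len(X), len(X[0])
    norms = [math.sqrt(sum(X[i][j] ** 2 for i in range(n))) for j in range(p)]
    return [[X[i][j] / norms[j] for j in range(p)] for i in range(n)]

def jacobi_svd(A, tol=1e-12, max_sweeps=60):
    """One-sided Jacobi SVD. Returns singular values d (unsorted) and V."""
    n, p = len(A), len(A[0])
    U = [row[:] for row in A]
    V = [[float(i == j) for j in range(p)] for i in range(p)]
    for _ in range(max_sweeps):
        rotated = False
        for a in range(p - 1):
            for b in range(a + 1, p):
                alpha = sum(U[i][a] ** 2 for i in range(n))
                beta = sum(U[i][b] ** 2 for i in range(n))
                gamma = sum(U[i][a] * U[i][b] for i in range(n))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns a and b are already orthogonal
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same plane rotation to columns a, b of U and V.
                for M in (U, V):
                    for row in M:
                        row[a], row[b] = c * row[a] - s * row[b], s * row[a] + c * row[b]
        if not rotated:
            break
    d = [math.sqrt(sum(U[i][j] ** 2 for i in range(n))) for j in range(p)]
    return d, V

def collinearity_diagnostics(X):
    """Condition indices and variance proportions from the SVD of scaled X.

    props[r][k] is the share of var(b_k) tied to the r-th singular value,
    with rows ordered from the largest singular value to the smallest.
    """
    d, V = jacobi_svd(scale_columns(X))
    p = len(d)
    order = sorted(range(p), key=lambda j: -d[j])
    cond_idx = [d[order[0]] / d[j] for j in order]
    phi = [[(V[k][j] / d[j]) ** 2 for j in order] for k in range(p)]
    props = [[phi[k][r] / sum(phi[k]) for k in range(p)] for r in range(p)]
    return cond_idx, props

# Toy data (made up): intercept, x1, and x2 that is almost exactly 2 * x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.01, -0.02, 0.015, -0.01, 0.02, -0.015, 0.01, -0.005]
X = [[1.0, x, 2.0 * x + e] for x, e in zip(x1, noise)]

cond_idx, props = collinearity_diagnostics(X)
for ci, row in zip(cond_idx, props):
    print(round(ci, 1), [round(v, 3) for v in row])
```

Each printed row is one condition index followed by the variance proportions for the intercept, x1 and x2. A row with a large condition index and large proportions for two or more coefficients points to the variables involved in a near-dependency.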

And I certainly agree that a problematic system means that you need to 
do more work - probably either collect more data or refine your research 
agenda, as collinearity may just be inherent in the independent 
variables you have been collecting.

Best,
Stephan


Tim Paysen schrieb:
> Actually, the condition index (CI) and VIF are just a start.  It is
> best to look at what they call a matrix of "variance proportions"
> (found in SAS and a few other places...)--which hardly anyone
> understands (including the SAS folks).  It is a matrix that splits
> the estimated variance of each regression coefficient into the
> shares associated with each singular value of the design matrix.
> It shows which factors dominate over others IN THE PARTICULAR SETUP
> you are analyzing.  The matrix is often calculated using
> eigenvalues, but is best done with Singular Value Decomposition
> techniques (you don't have to have a square matrix, and you
> maintain better precision).  Analysts will say that it can reveal
> an unstable system--which is correct, but they generally say that,
> if it's true, you have bad data and should throw it out--or collect
> more.  I suggest care, because it may be illustrating the nature of
> the system you are studying.
> 
> The only decent reference that I know of is a little book (hard to
> read) that I can't remember off the top of my head.  Have to look it
> up.
> 
> Timothy E. Paysen, PhD
> Research Forester (ret.)
> 
> 
> 
> 
> ________________________________
> From: John Sorkin <jsorkin at grecc.umaryland.edu>
> To: Alex Roy <alexroy2008 at gmail.com>; r-help at r-project.org
> Sent: Tuesday, July 21, 2009 4:19:11 AM
> Subject: Re: [R] Collinearity in Linear Multiple Regression
> 
> I suggest you start by doing some reading about the condition index
> (CI) and the variance inflation factor (VIF). Once you have reviewed
> the theory, a search of search.r-project.org (under the help menu in
> a Windows-based R installation) for VIF will help you obtain values
> for VIF, e.g. http://finzi.psych.upenn.edu/R/library/HH/html/vif.html
> John
> 
> John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics 
> University of Maryland School of Medicine Division of Gerontology 
> Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) 
> Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913
> (Please call phone number above prior to faxing)
> 
>>>> Alex Roy <alexroy2008 at gmail.com> 7/21/2009 7:01 AM >>>
> Dear all, How can I test for collinearity in the predictor data set
> for multiple linear regression?
> 
> Thanks
> 
> Alex
> 
> 
> ______________________________________________ R-help at r-project.org
> mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
> read the posting guide http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.
> 



