[R] Linear Regression Problem

Liaw, Andy andy_liaw at merck.com
Tue Jul 14 17:43:37 CEST 2009


For the coefficient to be equal to the correlation, you need to scale y as well as x. With only x scaled, the slope is the correlation times sd(y), which can exceed 1.

You can get the correlations by something like the following and then back-calculate the coefficients from there.

R> x = matrix(rnorm(100*4e4), 100, 4e4)
R> y = rnorm(100)
R> rxy = cor(x, cbind(y))
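
A minimal sketch of that back-calculation, assuming the usual simple-regression identities: slope = r * sd(y) / sd(x), R^2 = r^2, and t = r * sqrt((n - 2) / (1 - r^2)) on n - 2 degrees of freedom. The names beta, r2, tstat and pval are only illustrative, and p is kept small here so the check against lm() runs quickly; the same lines should work unchanged with 4e4 columns.

```r
set.seed(1)
n <- 100
p <- 50                          # illustrative; the original problem has 4e4 columns
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

rxy <- drop(cor(x, y))           # correlation of each column of x with y

# Slope of lm(y ~ x[, k]) for every k: r * sd(y) / sd(x_k)
beta <- rxy * sd(y) / apply(x, 2, sd)

# R^2 of each single-predictor fit is the squared correlation
r2 <- rxy^2

# t statistic and two-sided p-value for each slope, n - 2 degrees of freedom
tstat <- rxy * sqrt((n - 2) / (1 - rxy^2))
pval <- 2 * pt(-abs(tstat), df = n - 2)

# Sanity check against an explicit fit for the first predictor
fit <- summary(lm(y ~ x[, 1]))
stopifnot(
  isTRUE(all.equal(unname(beta[1]), unname(fit$coefficients[2, 1]))),
  isTRUE(all.equal(unname(pval[1]), unname(fit$coefficients[2, 4]))),
  isTRUE(all.equal(unname(r2[1]), fit$r.squared))
)
```

This avoids fitting 40,000 separate lm() models; everything comes from one cor() call and a few vectorized lines.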

Andy 

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Alex Roy
> Sent: Tuesday, July 14, 2009 11:29 AM
> To: Vito Muggeo (UniPa)
> Cc: r-help at r-project.org
> Subject: Re: [R] Linear Regression Problem
> 
> Dear Vito,
> Thanks for your comments. But I want to do simple linear regression,
> not multiple linear regression. Multiple linear regression is not
> possible here because the number of variables is much larger than the
> number of samples (X is ill-conditioned; the inverse of X^T X does not
> exist!). I just want to take one predictor variable at a time, regress
> y on it, and store the regression coefficients, p-values and R^2
> values. The loop goes up to 40,000 predictors.
> 
> Alex
> On Tue, Jul 14, 2009 at 5:18 PM, Vito Muggeo (UniPa)
> <vito.muggeo at unipa.it> wrote:
> 
> > dear Alex,
> > I think your problem, with a large number of predictors and a
> > relatively small number of subjects, may be tackled via some
> > regularization approach (ridge or lasso regression).
> >
> > hope this helps you,
> > vito
> >
> > Alex Roy wrote:
> >
> >> Dear All,
> >> I have a matrix, say X (100 x 40,000), and a vector, say y (100 x 1).
> >> I want to perform linear regression. I have scaled the X matrix using
> >> scale() to get mean zero and s.d. 1, but I still get very high values
> >> of the regression coefficients. If I scale the X matrix, then the
> >> regression coefficients will behave as correlation coefficients and
> >> they should not be more than 1. Am I right? I do not know what's
> >> going wrong.
> >> Thanks for your help.
> >> Alex
> >>
> >>
> >> *Code:*
> >>
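> >> # slope estimate from each single-predictor fit
> >> UniBeta <- sapply(seq_len(ncol(X)), function(k)
> >>   summary(lm(y ~ X[, k]))$coefficients[2, 1])
> >>
> >> # p-value of the slope from each single-predictor fit
> >> pval <- sapply(seq_len(ncol(X)), function(l)
> >>   summary(lm(y ~ X[, l]))$coefficients[2, 4])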
> >>
> > --
> > ====================================
> > Vito M.R. Muggeo
> > Dip.to Sc Statist e Matem `Vianelli'
> > Università di Palermo
> > viale delle Scienze, edificio 13
> > 90128 Palermo - ITALY
> > tel: 091 6626240
> > fax: 091 485726/485612
> > http://dssm.unipa.it/vmuggeo
> > ====================================
> >