[R] Selecting variables in a multivariate regression

peter dalgaard pdalgd at gmail.com
Mon Apr 14 16:11:37 CEST 2014


On 14 Apr 2014, at 15:33 , Bert Gunter <gunter.berton at gene.com> wrote:

> Well, this is your second post on the same topic, your first having
> received no response. So you should suspect something is amiss and
> reconsider before continuing, don't you think?
> 
> 1. I, for one, was not able to make any sense of your query. You do
> not appear to understand regression, so I would suggest you spend time
> with a local statistical resource before continuing with online
> posts. If my understanding of your misunderstanding is correct, you
> need to comprehend basics. If not, apologies.
> 

The problem as such makes OK sense to me: a multivariate linear model in which not all regressors affect all outputs, here with one coefficient (b) shared between the two equations. The simplest case of this is what is known as "seemingly unrelated regressions". The thing not known/understood by the poster is that such models are outside the scope of the MANOVA-type models, which are all that lm() with a matrix response is designed to fit: every response gets the same set of regressors, and each column of coefficients is estimated separately. The "sem" and "systemfit" packages may be of help, but some reading and/or consultation with someone with the relevant expertise may be necessary.
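
For the specific two-equation example in the original question, one possible base-R workaround is to stack the two equations into a single long regression, zeroing out X in the Q rows and Z in the P rows so that a single Y column forces a common b. This is only a sketch under assumptions: the data are simulated, the "true" values a = 2, b = -1, c = 0.5 are made up, and the names dat, stacked, fit and est are illustrative. It is plain least squares on the stacked data, so it ignores any correlation or difference in variance between the errors of the two equations; a proper SUR fit would account for those.

```r
# Simulated data frame with columns P, Q, X, Y, Z.
# The coefficients a = 2, b = -1, c = 0.5 are invented for illustration.
set.seed(1)
n <- 200
dat <- data.frame(X = rnorm(n), Y = rnorm(n), Z = rnorm(n))
dat$P <- 2.0 * dat$X - 1.0 * dat$Y + rnorm(n, sd = 0.1)   # P = aX + bY
dat$Q <- 0.5 * dat$Z - 1.0 * dat$Y + rnorm(n, sd = 0.1)   # Q = cZ + bY

# Stack the two equations: first n rows are P, last n rows are Q.
# X is zero in the Q rows and Z is zero in the P rows, so "a" enters
# only P, "c" enters only Q, and the shared Y column gives one "b".
stacked <- data.frame(
  resp = c(dat$P, dat$Q),
  X    = c(dat$X, rep(0, n)),
  Y    = c(dat$Y, dat$Y),
  Z    = c(rep(0, n), dat$Z)
)

# No intercept, matching the "- 1" in the original formula.
fit <- lm(resp ~ X + Y + Z - 1, data = stacked)
est <- coef(fit)   # named vector: X is a, Y is b, Z is c
print(est)
```

The standard errors reported by summary(fit) rest on the assumption of independent, equal-variance errors across all 2n stacked rows, which is exactly the assumption a SUR model relaxes, so they should be read with caution.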

Peter D.

> 2. Have you read An Introduction to R (ships with R) or an online R
> tutorial of your choice? If not, do so before posting here further. We
> expect minimal efforts of posters to solve their own problems before
> posting. Again, apologies if I err.
> 
> Cheers,
> Bert
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
> 
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> H. Gilbert Welch
> 
> 
> 
> 
> On Sun, Apr 13, 2014 at 8:08 PM, Edson Tirelli <ed.tirelli at gmail.com> wrote:
>> I am quite new to R and I am having trouble figuring out how to select
>> variables in a multivariate linear regression in R. My google-fu also
>> did not find anything.
>> 
>> Pretend I have the following formulas:
>> 
>> P = aX + bY
>> Q = cZ + bY
>> 
>> I have a data frame with columns P, Q, X, Y, Z and I need to find a, b and c.
>> 
>> If I do a simple multivariate regression:
>> 
>> result <- lm( cbind( P, Q ) ~ X + Y + Z - 1 )
>> 
>> It calculates a coefficient for "c" on P's regression and for "a" on
>> Q's regression.
>> 
>> If I calculate the regressions individually then "b" will be different
>> in each regression.
>> 
>> How can I select the variables to consider in a multivariate
>> regression? I.e., how do I tell R to ignore cZ when calculating P, and
>> ignore aX when calculating Q?
>> 
>> Thank you,
>> Edson
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com



