[R] Fitting linear models

Dimitri Liakhovitski ld7631 at gmail.com
Tue Apr 21 17:35:52 CEST 2009


Can we see your data so that we can replicate the problem? Or maybe a
subset of the data with some fake variable names?
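
Without the data this is only a guess, but one common reason for an NA
coefficient that moves when the terms are reordered is that one predictor
is an exact linear combination of the others. The sketch below uses
simulated values; the data frame name "MyData", the sample size, and the
0.4/0.3 relationship between NH4, SO4 and NO3 are assumptions made purely
for illustration and are not taken from the real data. It also shows one
way to paste a small subset into an email with dput().

```r
# Simulated stand-in data: variable names mirror the thread, values are made up.
set.seed(1)
n   <- 50
SO4 <- runif(n)
NO3 <- runif(n)
# ASSUMPTION for illustration: NH4 is an exact linear combination of SO4 and NO3.
NH4 <- 0.4 * SO4 + 0.3 * NO3
PBW <- 0.01 + 0.02 * SO4 + 0.02 * NO3 + rnorm(n, sd = 0.005)

MyData <- data.frame(PBW, SO4, NO3, NH4)

# Same model, two term orders: the term that is redundant given the
# earlier ones is the one reported as NA.
fit1 <- lm(PBW ~ SO4 + NO3 + NH4, data = MyData)
fit2 <- lm(PBW ~ NH4 + NO3 + SO4, data = MyData)
print(coef(fit1))   # NH4 is NA
print(coef(fit2))   # SO4 is NA

# alias() reports which terms are linear combinations of the others.
print(alias(fit1))

# Prints a small subset in a form that can be pasted into an email
# and read straight back into R by someone else.
dput(head(MyData, 10))
```

If alias() on the real model reports a dependency, dropping one of the
variables involved should leave a full set of coefficients. A pairwise
correlation matrix can miss this, because the dependency may involve
three or more variables at once.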

On Tue, Apr 21, 2009 at 11:32 AM, Vemuri, Aparna <avemuri at epri.com> wrote:
> Yes, they are all of the same length.
>
> -----Original Message-----
> From: Dimitri Liakhovitski [mailto:ld7631 at gmail.com]
> Sent: Tuesday, April 21, 2009 8:32 AM
> To: Vemuri, Aparna
> Cc: r-help at r-project.org
> Subject: Re: [R] Fitting linear models
>
> Are they of the same length?
>
> On Tue, Apr 21, 2009 at 11:31 AM, Vemuri, Aparna <avemuri at epri.com> wrote:
>> The variables are all in separate vectors.
>>
>> -----Original Message-----
>> From: Dimitri Liakhovitski [mailto:ld7631 at gmail.com]
>> Sent: Tuesday, April 21, 2009 8:26 AM
>> To: Vemuri, Aparna
>> Cc: David Winsemius; r-help at r-project.org
>> Subject: Re: [R] Fitting linear models
>>
>> Aparna,
>>
>> I should have been more explicit. Run ?lm. You'll see this:
>>
>> "lm(formula, data, subset, weights, na.action,
>>   method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE,
>>   singular.ok = TRUE, contrasts = NULL, offset, ...)"
>>
>> So, in addition to specifying the formula, you should normally specify
>> the data frame in which you keep your variables. I assume they are in a
>> data frame? (unless for some reason you keep all variables as
>> separate vectors, in which case lm() looks them up in the workspace).
>> So, after you write the formula, you indicate the name of the
>> data frame, for example "MyData":
>>
>> model1<-lm(PBW~SO4+NO3+NH4, MyData)
>>
>> Dimitri
>>
>> On Tue, Apr 21, 2009 at 11:12 AM, Vemuri, Aparna <avemuri at epri.com> wrote:
>>> David,
>>> Thanks for the suggestions. No, I did not label my dependent variable "function".
>>>
>>> My dependent variable PBW and all the independent variables are continuous variables. It is especially troubling since the order in which I input the independent variables determines whether or not each one gets a coefficient. As I already mentioned, I checked the correlation matrix and picked the variables with moderate to high correlation with the dependent variable. So I guess it is not so naïve to expect a regression coefficient on all of them.
>>>
>>> Dimitri:
>>> model1<-lm(PBW~SO4+NO3+NH4) gives me the same result as before.
>>>
>>> Bert:
>>> This is not homework. But I will remember to do my research before posting here.
>>>
>>> Aparna
>>>
>>>
>>> -----Original Message-----
>>> From: David Winsemius [mailto:dwinsemius at comcast.net]
>>> Sent: Monday, April 20, 2009 5:35 PM
>>> To: Vemuri, Aparna
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] Fitting linear models
>>>
>>>
>>> On Apr 20, 2009, at 7:26 PM, Vemuri, Aparna wrote:
>>>
>>>> I am not sure if this is an R-users question, but since most of you here
>>>> are statisticians, I decided to give it a shot.
>>>
>>> You can omit the unnecessary preambles.
>>>>
>>>>
>>>> I am using the lm() function in R to fit a dependent variable to a set
>>>> of 3 to 5 independent variables. For this, I used the following
>>>> commands:
>>>>
>>>>> model1<-lm(function=PBW~SO4+NO3+NH4)
>>>> Coefficients:
>>>> (Intercept)          SO4          NO3          NH4
>>>>     0.01323      0.01968      0.01856           NA
>>>>
>>>> and
>>>>
>>>>> model2<-lm(function=PBW~SO4+NO3+NH4+Na+Cl)
>>>>
>>>> Coefficients:
>>>> (Intercept)         SO4         NO3         NH4          Na          Cl
>>>>  -0.0006987  -0.0119750  -0.0295042   0.0842989   0.1344751          NA
>>>>
>>>> In both cases, the last independent variable has a coefficient of NA in
>>>> the result. I say last variable because, when I change the order of the
>>>> variables, the coefficient changes (see below). Can anyone point me to
>>>> the reason R behaves this way? Is there any way for me to force R to use
>>>> all the variables? I checked the correlation matrices to make sure
>>>> there is no orthogonality between the variables.
>>>
>>> You really did not name your dependent variable "function" did you?
>>> Please stop that.
>>>
>>> Just a guess, ... since you have not provided enough information to do
>>> otherwise, ... Are all of those variables 1/0 dummy variables? If so
>>> and if you want to have an output that satisfies your need for
>>> labeling the coefficients as you naively anticipate, then put "0+" at
>>> the beginning of the formula or "-1" at the end, so that the intercept
>>> will disappear and then all variables will get labeled as you expect.
>>>
>>> --
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Dimitri Liakhovitski
>> MarketTools, Inc.
>> Dimitri.Liakhovitski at markettools.com
>>
>
>
>
> --
> Dimitri Liakhovitski
> MarketTools, Inc.
> Dimitri.Liakhovitski at markettools.com
>



-- 
Dimitri Liakhovitski
MarketTools, Inc.
Dimitri.Liakhovitski at markettools.com



