[R] Factors in a regression using lm()

Ivan Calandra ivan.calandra at uni-hamburg.de
Tue Oct 12 12:06:06 CEST 2010


  Oops, my bad.
I rarely do regression, so I forgot that in your case the explanatory 
variables do not have to be factors.
The rest stands.
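
A minimal sketch of the fix, using made-up data in place of Hdma. The data frame toy and the column deny01 are names I invented for illustration, and the glm() call is only a suggested alternative, not something from the original question:

```r
# Toy data standing in for Hdma (values are made up for illustration).
set.seed(1)
n <- 200
toy <- data.frame(
  hir   = runif(n, 0.1, 0.5),
  black = factor(sample(c("no", "yes"), n, replace = TRUE),
                 levels = c("no", "yes")),
  deny  = factor(sample(c("no", "yes"), n, replace = TRUE),
                 levels = c("no", "yes"))
)

# lm() needs a numeric response, so recode the factor as 0/1.
# Note: as.numeric(toy$deny) alone would return the level codes 1/2.
toy$deny01 <- as.numeric(toy$deny == "yes")

# Factor predictors such as black can stay as factors;
# lm() expands them into dummy variables on its own.
fit_lm <- lm(deny01 ~ hir + black, data = toy)

# For a yes/no response, logistic regression is a common alternative;
# glm() with family = binomial accepts the factor response directly.
fit_glm <- glm(deny ~ hir + black, data = toy, family = binomial)

summary(fit_lm)
```

With the real data, the same recoding should apply to Hdma$deny, assuming its levels are "no" and "yes" as in the str() output below.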
Ivan

On 10/12/2010 11:56, Ivan Calandra wrote:
>  Hi,
>
> Your response (dependent) variable, which has to be on the left side 
> of the '~' in the formula, should be numeric. In your example deny is 
> a factor; that is the first problem.
> The explanatory variables, on the right side of the '~', should be 
> factors. Here, hir, dir, ccs and mcs are numeric; that is the second 
> problem. Only black is a factor.
>
> There are two possibilities (not mutually exclusive):
> - you should convert your factors to numeric and vice versa as 
> needed; see ?factor and ?as.numeric, as well as the stringsAsFactors 
> argument in ?read.table (I guess you imported your data.frame that way)
> - you should adjust your model formula. It might be that you mixed up 
> the variables in the formula. See ?formula
>
> HTH,
> Ivan
>
> On 10/12/2010 11:39, Gabriel Bergin wrote:
>> Hi,
>>
>> I am trying to do a multiple regression on the dataset "Hdma", available in
>> the Ecdat package.
>>
>> The data looks like this:
>> > str(Hdma)
>> 'data.frame': 2381 obs. of  13 variables:
>>   $ dir        : num  0.221 0.265 0.372 0.32 0.36 ...
>>   $ hir        : num  0.221 0.265 0.248 0.25 0.35 ...
>>   $ lvr        : num  0.8 0.922 0.92 0.86 0.6 ...
>>   $ ccs        : num  5 2 1 1 1 1 1 2 2 2 ...
>>   $ mcs        : num  2 2 2 2 1 1 2 2 2 1 ...
>>   $ pbcr       : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
>>   $ dmi        : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 2 1 ...
>>   $ self       : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
>>   $ single     : Factor w/ 2 levels "no","yes": 1 2 1 1 1 1 2 1 1 2 ...
>>   $ uria       : num  3.9 3.2 3.2 4.3 3.2 ...
>>   $ comdominiom: num  0 0 0 0 0 0 1 0 0 0 ...
>>   $ black      : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
>>   $ deny       : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 2 1 ...
>>
>> I would like to try a more complex regression, but even this relatively
>> uncomplicated one returns an error:
>>
>> summary(lm(deny ~ hir + dir + ccs + mcs + black))
>>
>> The error I get is:
>> Error in storage.mode(y)<- "double" :
>>    invalid to change the storage mode of a factor
>> In addition: Warning message:
>> In model.response(mf, "numeric") :
>>    using type="numeric" with a factor response will be ignored
>>
>> I understand that there is something wrong due to the fact that some of the
>> variables are factors. But as far as I've grasped, it should be possible to
>> include factor variables when using lm(). Am I in error in thinking this?
>>
>> Sincerely,
>> Gabriel Bergin
>> Undergraduate economics student
>>
>

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


