[R] Factor variables with GAM models

Corrado ct529 at york.ac.uk
Sat Mar 20 10:19:06 CET 2010


You can sometimes manually substitute a categorical variable with a set 
of continuous variables.

For example, suppose you have a variable such as "landcover.class" with 
3 values: "class A", "class B" and "class C". You can transform it into 
3 continuous variables, landcover.class.A, landcover.class.B and 
landcover.class.C, and assign each a value of 1 (or 100%) for elements 
belonging to that class and 0 for elements that do not.
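
A minimal sketch of that recoding in base R. The data frame `d` and its 
values are made up for illustration; only the column name 
landcover.class comes from the example above.

```r
# Hypothetical example data: one factor with three levels.
d <- data.frame(
  landcover.class = factor(c("A", "B", "C", "A", "C"))
)

# Build one 0/1 indicator column per level by hand.
for (lev in levels(d$landcover.class)) {
  d[[paste0("landcover.class.", lev)]] <- as.numeric(d$landcover.class == lev)
}

# The same recoding via model.matrix(); "- 1" drops the intercept so that
# every level gets its own column instead of one level being the baseline.
ind <- model.matrix(~ landcover.class - 1, data = d)

print(d)
print(ind)
```

If the model also has an intercept, use only two of the three indicator 
columns, since all three together are collinear with the intercept.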

That helps sometimes.
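
As a rough sketch of how such indicator columns might then enter a GAM 
next to a smooth term: this assumes gam() from the mgcv package (the 
thread does not say which gam implementation is in use) and simulated 
data. The names x, y and d are hypothetical.

```r
library(mgcv)  # assumption: mgcv's gam(); the gam package also provides a gam()

set.seed(1)
n <- 200
d <- data.frame(
  x = runif(n),
  landcover.class = factor(sample(c("A", "B", "C"), n, replace = TRUE))
)

# Indicator columns for classes B and C; class A is the baseline
# absorbed by the intercept.
d$landcover.class.B <- as.numeric(d$landcover.class == "B")
d$landcover.class.C <- as.numeric(d$landcover.class == "C")

# Simulated Poisson counts with a non-linear effect of x.
eta <- 0.5 + sin(2 * pi * d$x) +
  0.4 * d$landcover.class.B - 0.3 * d$landcover.class.C
d$y <- rpois(n, exp(eta))

# Smooth term for the continuous predictor, plain linear terms
# for the 0/1 indicator columns.
fit <- gam(y ~ s(x) + landcover.class.B + landcover.class.C,
           family = poisson, data = d)
print(summary(fit))
```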

Regards

Noah Silverman wrote:
> Steve,
>
> I get that.  What you wrote makes sense.
>
> My challenge is the data I'm attempting to model.  Some of the 
> variables are continuous, some are factors.  Both linear and Poisson 
> models work (the Poisson doing a much more accurate job).  However, some 
> of the numerical variables are clearly non-linear.  Hence my interest 
> in GAM.  I suppose one alternative would be to try some polynomial 
> transformation on the variable as part of a Poisson model.
>
> Any other suggestions would be welcome.
>
> Thanks!
>
> -N
>
> On 3/19/10 8:37 PM, Steven McKinney wrote:
>> Hi Noah
>>
>> GAM models were developed to assess the functional form
>> of the relationship of continuous predictor variables to the
>> response, so weren't really meant to handle factor variables
>> as predictor variables.  GAMs are of the form
>> E(Y | X1, X2, ...) = S0 + S1(X1) + S2(X2) + ...
>> where each Si(Xi) is a smooth function of Xi.
>>
>> Hence you might want to rethink why you'd want a
>> factor variable as a predictor variable in a GAM.
>> This is why the gam machinery doesn't just do the
>> factor conversion to indicator variables as is done in
>> lm.
>>
>> HTH
>>
>> Steven McKinney
>>
>> ________________________________________
>> From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On 
>> Behalf Of Noah Silverman [noah at smartmediacorp.com]
>> Sent: March 19, 2010 12:54 PM
>> To: r-help at r-project.org
>> Subject: [R] Factor variables with GAM models
>>
>> I'm just starting to learn about GAM models.
>>
>> When using the lm function in R, any factors I have in my data set are
>> automatically converted into a series of binary variables.
>>
>> For example, if I have a data.frame with a column named color and values
>> "red", "green", "blue", the lm function automatically replaces it with
>> 3 variables colorred, colorgreen, colorblue, which are binary {0,1}.
>>
>> When I use the gam function, R doesn't do this so I get an error.
>>
>> 1) Is there a way to ask the gam function to do this conversion for me?
>> 2) If not, is there some other tool or utility to make this data
>> transformation easy?
>> 3) Last option - can I use lm to transform the data and then extract it
>> into a new data.frame to then pass to gam?
>>
>> Thanks!!!
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>


-- 
Corrado Topi
PhD Researcher
Global Climate Change and Biodiversity
Area 18, Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529 at york.ac.uk


