[R] creating dummy variables

David Winsemius dwinsemius at comcast.net
Sun Apr 21 01:17:25 CEST 2013


On Apr 20, 2013, at 3:56 PM, Eva Prieto Castro wrote:

> 
> Hi,
> 
> Why do you write that dummy variables are not needed in R? I would like you to explain it.

I suppose you might want individual instruction, but Rhelp was established with certain principles (expressed in the Posting Guide), one of which is that persons posting to Rhelp should have made a demonstrated effort on their own to study the offered documentation. You have not yet demonstrated that you understand this principle.

-- 
David.
> 
> Thanks, 
> 
> Eva
> 
> --- On Sun, 21/4/13, David Winsemius <dwinsemius at comcast.net> wrote:
> 
> From: David Winsemius <dwinsemius at comcast.net>
> Subject: Re: [R] creating dummy variables
> To: "Bert Gunter" <gunter.berton at gene.com>
> CC: "r-help at R-project.org" <r-help at r-project.org>, "shyam basnet" <shyamabc2002 at yahoo.com>
> Date: Sunday, April 21, 2013 00:38
> 
> 
> On Apr 20, 2013, at 2:03 PM, Bert Gunter wrote:
> 
> > Dummy variables are not needed in R.
> > 
> > Bert
> > 
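A minimal sketch of what "not needed" means in practice, assuming the snippet from the original question is stored in a data frame called `dat` and that `crop` is a factor. The model `value ~ crop` is only an illustrative assumption; the thread does not say which model the poster intends to fit.

```r
# Data reconstructed from the snippet in the original question.
dat <- data.frame(
  fid   = rep(c("5_1_1", "5_2_2"), each = 4),
  crop  = factor(c("SWHE", "SWHE", "BARL", "BARL",
                   "SWHE", "SWHE", "BARL", "BARL")),
  year  = c(1995, 1997, 1996, 1997, 1995, 1996, 1996, 1997),
  value = c(171, 696, 114, 344, 120, 511, 239, 349)
)

# The factor goes straight into the formula; lm() builds the
# indicator column internally, with BARL (the first level) as baseline.
fit <- lm(value ~ crop, data = dat)
coef(fit)  # an intercept plus a single 'cropSWHE' coefficient
```

With the default treatment contrasts, the intercept is the mean of the baseline level and `cropSWHE` is the difference between the two crop means, so no hand-made 0/1 columns are required.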
> 
> Bert is correct on this point, but if you want to know how the regression functions in R do this "behind the scenes", you can always look at:
> 
> ?model.matrix     # where _some_ of the automagical stuff happens
> 
> > model.matrix( ~ crop, data=dat[,"crop", drop=FALSE])
>   (Intercept) cropSWHE
> 1           1        1
> 2           1        1
> 3           1        0
> 4           1        0
> 5           1        1
> 6           1        1
> 7           1        0
> 8           1        0
> attr(,"assign")
> [1] 0 1
> attr(,"contrasts")
> attr(,"contrasts")$crop
> [1] "contr.treatment"
> 
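If the explicit SWHE and BARL columns from the original question are still wanted (for export, say), one possible sketch is to drop the intercept from the formula so that `model.matrix` returns one indicator column per level. The data frame `dat` is reconstructed from the posted snippet, and the name `dat2` is a hypothetical choice for this illustration.

```r
# Data reconstructed from the snippet in the original question.
dat <- data.frame(
  fid   = rep(c("5_1_1", "5_2_2"), each = 4),
  crop  = factor(c("SWHE", "SWHE", "BARL", "BARL",
                   "SWHE", "SWHE", "BARL", "BARL")),
  year  = c(1995, 1997, 1996, 1997, 1995, 1996, 1996, 1997),
  value = c(171, 696, 114, 344, 120, 511, 239, 349)
)

# '0 +' removes the intercept, so every level gets its own 0/1 column.
ind <- model.matrix(~ 0 + crop, data = dat)

# Columns come back as "cropBARL", "cropSWHE" (level order);
# rename them to the bare level names.
colnames(ind) <- levels(dat$crop)

# Append the indicators in the column order shown in the question.
dat2 <- cbind(dat, ind[, c("SWHE", "BARL")])
dat2
```

The two indicator columns sum to 1 in every row, which is why a model with an intercept uses only one of them.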
> 
> 
> > Sent from my iPhone -- please excuse typos.
> > 
> > On Apr 20, 2013, at 11:23 AM, shyam basnet <shyamabc2002 at yahoo.com> wrote:
> > 
> >> Hello R-users,
> >> 
> >> Below is a snippet of my data:
> >> 
> >> 
> >> fid  crop  year  value   
> >> 5_1_1  SWHE  1995  171   
> >> 5_1_1  SWHE  1997  696   
> >> 5_1_1  BARL  1996  114   
> >> 5_1_1  BARL  1997  344   
> >> 5_2_2  SWHE  1995  120   
> >> 5_2_2  SWHE  1996  511   
> >> 5_2_2  BARL  1996  239   
> >> 5_2_2  BARL  1997  349   
> >> 
> >> Here, I want to create dummy variables named after the values in the column 'crop', so that the new variable 'SWHE' receives a value of 1 if the column 'crop' contains 'SWHE' and 0 otherwise. I would then have two new variables, SWHE and BARL, as below:
> >> 
> >> 
> >> fid  crop  year  value  SWHE  BARL   
> >> 5_1_1  SWHE  1995  171  1  0   
> >> 5_1_1  SWHE  1997  696  1  0   
> >> 5_1_1  BARL  1996  114  0  1   
> >> 5_1_1  BARL  1997  344  0  1   
> >> 5_2_2  SWHE  1995  120  1  0   
> >> 5_2_2  SWHE  1996  511  1  0   
> >> 5_2_2  BARL  1996  239  0  1   
> >> 5_2_2  BARL  1997  349  0  1   
> >> 
> >> 
> >> Cheers,
> >> Shyam
> >> Nepal
> >> 
> 
> David Winsemius
> Alameda, CA, USA
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

David Winsemius
Alameda, CA, USA


