[R] Splitting a categorical variable into multiple variables

Bert Gunter gunter.berton at gene.com
Fri Aug 9 20:09:26 CEST 2013


Actually, I think it's pretty trivial if you do it in a smarter way
than I previously suggested. I found this by reading ?levels (RTFM,
Bert!)

> z <- factor(letters[1:3])
> levels(z)[1:2] <- "d"  ## no hardcoded names; just use indices
> z
[1] d d c
Levels: d c
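
Applied to the original question, the same index-based assignment can
build two variables that each stay within the 32-level limit. What
follows is only a sketch under stated assumptions: the data, the level
names, the catch-all label "other", and the cutoff k are all made up for
illustration, and whether this kind of split suits a given model is a
separate question.

```r
# Hypothetical data: a factor with 40 levels (names invented for illustration).
set.seed(1)
lev <- sprintf("L%02d", 1:40)
x <- factor(sample(lev, 200, replace = TRUE), levels = lev)

k <- 31  # assumed cutoff: keep at most 31 original levels, plus one "other"

# First variable: levels 1..k kept, all remaining levels merged into "other".
x1 <- x
levels(x1)[(k + 1):nlevels(x)] <- "other"

# Second variable: levels (k+1)..40 kept, the first k merged into "other".
x2 <- x
levels(x2)[1:k] <- "other"

nlevels(x1)  # 32
nlevels(x2)  # 10
```

Each observation keeps its original level in exactly one of x1 and x2 and
is "other" in the other, so the pair together still carries the full
information of x.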

Cheers,
Bert

On Fri, Aug 9, 2013 at 8:45 AM, Claus O'Rourke <claus.orourke at gmail.com> wrote:
> Thanks Bert. I guess I was just wondering if there was a way to create
> the new factors automatically without me having to hard code the level
> names manually in my R code.
>
> Rgds
>
> Claus
>
> On Fri, Aug 9, 2013 at 3:42 PM, Bert Gunter <gunter.berton at gene.com> wrote:
>> ... or if you want to keep the unchanged levels the same:
>>
>> zz <- factor(ifelse(z %in% c("a", "b"), "d", levels(z)[z]))
>>
>> -- Bert
>>
>> On Fri, Aug 9, 2013 at 7:35 AM, Bert Gunter <bgunter at gene.com> wrote:
>>> If I understand what you mean, just recode them.
>>>
>>> z <- factor(letters[1:3])
>>> z
>>> ## caution: z is a factor, so ifelse() returns its integer codes for the unchanged values
>>> zz <- factor(ifelse(z %in% c("a", "b"), "d", z))
>>> zz
>>>
>>> Cheers,
>>> Bert
>>>
>>> On Fri, Aug 9, 2013 at 7:10 AM, Claus O'Rourke <claus.orourke at gmail.com> wrote:
>>>> Hello R-Help,
>>>> I have a variable with > 32 levels and I'd like to split this into two
>>>> variables such that both new variables have <= 32 levels. This is
>>>> to handle the limit of 32 level predictor variables in R's Random
>>>> Forest implementation. Might someone be able to suggest an elegant way
>>>> to do this? I've tried googling for this, but haven't hit the right
>>>> search terms.
>>>>
>>>> Regards
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>>> --
>>>
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>>
>>> Internal Contact Info:
>>> Phone: 467-7374
>>> Website:
>>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


