[R] how to collapse categories or re-categorize variables?

Ista Zahn izahn at psych.rochester.edu
Sun Jul 18 00:40:14 CEST 2010


On Sat, Jul 17, 2010 at 9:03 PM, Peter Dalgaard <pdalgd at gmail.com> wrote:
> Ista Zahn wrote:
>> Hi,
>> On Fri, Jul 16, 2010 at 5:18 PM, CC <turtysmail at gmail.com> wrote:
>>> I am sure this is a very basic question:
>>>
>>> I have 600,000 categorical variables in a data.frame - each of which is
>>> classified as "0", "1", or "2"
>>>
>>> What I would like to do is collapse "1" and "2" and leave "0" by itself,
>>> such that after re-categorizing "0" = "0"; "1" = "1" and "2" = "1" --- in
>>> the end I only want "0" and "1" as categories for each of the variables.
>>
>> Something like this should work
>>
>> for (i in names(dat)) {
>>   dat[, i] <- factor(dat[, i], levels = c("0", "1", "2"),
>>                      labels = c("0", "1", "1"))
>> }
>
> Unfortunately, it won't:
>
>> d <- 0:2
>> factor(d, levels=c(0,1,1))
> [1] 0    1    <NA>
> Levels: 0 1 1
> Warning message:
> In `levels<-`(`*tmp*`, value = c("0", "1", "1")) :
>  duplicated levels will not be allowed in factors anymore
>

I stand corrected. Thank you Peter.

>
> This effect, I have been told, goes way back to design choices in S
> (that you can have repeated level names) plus compatibility ever since.
>
> It would make more sense if it behaved like
>
> d <- factor(d); levels(d) <- c(0,1,1)
>
> and maybe, some time in the future, it will. Meanwhile, the above is the
> workaround.
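
For the archive, a minimal sketch of that workaround on a single variable. The toy vector d and the name f are made up for illustration. The point is only that assigning a repeated name through levels<- merges the corresponding levels.

```r
# Toy vector standing in for one of the variables (illustrative data)
d <- c(0, 1, 2, 2, 0, 1)

# Build the factor with all three original levels first
f <- factor(d, levels = c(0, 1, 2))

# Assigning a repeated level name merges "1" and "2" into a single level
levels(f) <- c("0", "1", "1")

levels(f)   # "0" "1"
table(f)    # counts per collapsed category
```

The list form documented in ?levels, levels(f) <- list("0" = "0", "1" = c("1", "2")), should give the same result and spells out which old levels go where.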
>
> (BTW, if there are 600000 variables, you probably don't want to iterate
> over their names, more likely "for(i in seq_along(dat))...")
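
Putting the two suggestions together, the loop might look like the sketch below. The three-column data frame dat (v1 to v3) is a hypothetical stand-in for the real data, and I am assuming every column holds only "0", "1" and "2".

```r
# Hypothetical small data frame standing in for the 600,000-column one
dat <- data.frame(v1 = c("0", "1", "2"),
                  v2 = c("2", "2", "0"),
                  v3 = c("1", "0", "0"),
                  stringsAsFactors = TRUE)

for (i in seq_along(dat)) {
  # Fix the level set explicitly so every column is treated the same way,
  # even if a column happens to lack one of the three values
  f <- factor(dat[[i]], levels = c("0", "1", "2"))
  # Merge "1" and "2" by assigning a repeated level name
  levels(f) <- c("0", "1", "1")
  dat[[i]] <- f
}

sapply(dat, nlevels)  # number of levels left in each column
```

Indexing by position with dat[[i]] avoids looking each column up by name, which is what the seq_along() remark is about.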
>
> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


