[R] reduce three columns to one with the colnames

William Dunlap wdunlap at tibco.com
Mon May 13 18:13:30 CEST 2013


If the dataset is large you may prefer to process it by column instead of by row.  E.g.,

   > m <- matrix(0, nrow=1e6, ncol=3, dimnames=list(NULL,c("Red","Green","Blue")))
   > m[cbind(seq_len(nrow(m)), sample(ncol(m), size=nrow(m), replace=TRUE))] <- 1
   > d <- data.frame(m)
   > head(d)
     Red Green Blue
   1   0     0    1
   2   0     1    0
   3   1     0    0
   4   0     0    1
   5   0     1    0
   6   0     0    1
   > system.time(byRow <- colnames(d)[apply(d, 1, function(x)which(x==1))]) 
      user  system elapsed 
     73.81    0.19   74.64 
   > system.time(byCol <- with(d, ifelse(Red==1, "Red", ifelse(Green==1, "Green", "Blue"))))
      user  system elapsed 
      0.85    0.00    1.00 
   > identical(byRow, byCol)
   [1] TRUE
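
Another column-oriented option, not included in the timings above, is base R's max.col(), which returns for each row the index of the column holding the row maximum. The following is only a sketch: the small data frame d_small is made up for illustration, and the approach assumes every row contains exactly one 1.

```r
# Hypothetical small example; d_small is not part of the original post.
d_small <- data.frame(Red   = c(0, 0, 1, 0),
                      Green = c(0, 1, 0, 0),
                      Blue  = c(1, 0, 0, 1))

# max.col() gives, per row, the column index of the largest entry.
# With exactly one 1 per row, that is the column holding the 1.
# ties.method = "first" keeps the result deterministic (the default is "random").
byMaxCol <- colnames(d_small)[max.col(as.matrix(d_small), ties.method = "first")]

# Cross-check against the nested ifelse() approach on the same data.
byCol <- with(d_small, ifelse(Red == 1, "Red", ifelse(Green == 1, "Green", "Blue")))
stopifnot(identical(byMaxCol, byCol))
```

Unlike the nested ifelse(), this does not need the column names spelled out, so it should carry over to any number of indicator columns.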

Also, you ought to add checks that the data look the way you think they do, e.g.,
   stopifnot(all(as.matrix(d) %in% c(0, 1)), all(rowSums(d)==1))
or both of the above methods will silently give misleading results.
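
One way to make sure the check is never skipped is to bundle it with the conversion. This is a minimal sketch, assuming the input is a data frame or matrix of 0/1 indicator columns; the function name indicator_to_label and the example data are hypothetical, not something from the thread.

```r
# Hypothetical helper (name and interface are illustrative only).
indicator_to_label <- function(d) {
  m <- as.matrix(d)
  # Every entry must be 0 or 1, and every row must contain exactly one 1.
  stopifnot(all(m %in% c(0, 1)), all(rowSums(m) == 1))
  # Work column by column: fill in each column's name where that column is 1.
  out <- rep(NA_character_, nrow(m))
  for (nm in colnames(m)) out[m[, nm] == 1] <- nm
  out
}

# Well-formed input: exactly one 1 per row.
good <- data.frame(male = c(0, 1, 0), female = c(1, 0, 0), transsexuals = c(0, 0, 1))
labels <- indicator_to_label(good)

# A row with two 1s is rejected instead of being silently mislabelled.
bad <- data.frame(male = c(1, 0), female = c(1, 1), transsexuals = c(0, 0))
res <- tryCatch(indicator_to_label(bad), error = function(e) "rejected")
```

The loop runs once per column rather than once per row, so it stays in the spirit of the by-column approach.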

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of arun
> Sent: Monday, May 13, 2013 8:37 AM
> To: studerov at gmail.com
> Cc: R help
> Subject: Re: [R] reduce three columns to one with the colnames
> 
> Hi,
> Maybe:
> dat1<- read.table(text="
> male female transsexuals
> 0 1 0
> 1 0 0
> 0 0 1
> 0 1 0
> 1 0 0
> 1 0 0
> 0 1 0
> ",sep="",header=TRUE)
> 
>  dat1$sex<-colnames(dat1)[apply(dat1,1,function(x) which(x==1))]
>  dat1
> #  male female transsexuals          sex
> #1    0      1            0       female
> #2    1      0            0         male
> #3    0      0            1 transsexuals
> #4    0      1            0       female
> #5    1      0            0         male
> #6    1      0            0         male
> #7    0      1            0       female
> 
> 
> A.K.
> 
> 
> 
> ----- Original Message -----
> From: David Studer <studerov at gmail.com>
> To: Bert Gunter <gunter.berton at gene.com>
> Cc: r-help at r-project.org
> Sent: Monday, May 13, 2013 11:22 AM
> Subject: Re: [R] reduce three columns to one with the colnames
> 
> OK, seems like nobody understood my question ;-)
> 
> Let's make another example:
> 
> I have three variables:
> data$male and data$female and data$transsexuals
> 
> All the three of them contain the values 0 and 1.
> 
> Now I'd like to create another variable data$sex. In all cases where
> data$female==1 the variable data$sex should be set to 'female', in all
> cases where data$male==1 the variable data$sex should be set to 'male',
> and so on...
> 
> Thank you!
> 
> David
> 
> 
> 
> 
> 2013/5/13 Bert Gunter <gunter.berton at gene.com>
> 
> > No -- my answer is wrong. I'll leave it to others to correct. Obvious
> > question to OP: What if more than one of your color variables
> > simultaneously has a 1?
> >
> > -- Bert
> >
> > On Mon, May 13, 2013 at 8:09 AM, Bert Gunter <bgunter at gene.com> wrote:
> > > Cute answer, Pascal. It may even be the answer to the question the OP
> > > should have asked, but I don't think it answered the question that was
> > > asked. That might be:
> > >
> > > c("red"[red], "green"[green], "blue"[blue])
> > >
> > > Cheers,
> > > Bert
> > >
> > > On Mon, May 13, 2013 at 7:36 AM, Pascal Oettli <kridox at ymail.com> wrote:
> > >> Hi,
> > >>
> > >> ?rgb
> > >>
> > >> HTH
> > >> Pascal
> > >>
> > >>
> > >> 2013/5/13 David Studer <studerov at gmail.com>
> > >>
> > >>> Hello everybody,
> > >>>
> > >>> I have three variables "blue", "green" and "red" containing values 0 (no)
> > >>> and 1 (yes).
> > >>>
> > >>> How can I easily create another variable "colors" with the values "blue",
> > >>> "green" and "red"?
> > >>>
> > >>> I hope that you can understand my question and appreciate any solutions or
> > >>> hints!
> > >>>
> > >>> Thank you!
> > >>> David
> > >>>
> > >>>         [[alternative HTML version deleted]]
> > >>>
> > >>> ______________________________________________
> > >>> R-help at r-project.org mailing list
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide
> > >>> http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained, reproducible code.
> > >>>
> > >>
> > >
> > >
> > >
> > > --
> > >
> > > Bert Gunter
> > > Genentech Nonclinical Biostatistics
> > >
> > > Internal Contact Info:
> > > Phone: 467-7374
> > > Website:
> > >
> > > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> >
> 