[R] How to replace all non-maximum values in a row with 0

William Dunlap wdunlap at tibco.com
Fri Apr 9 17:24:35 CEST 2010


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of ONKELINX, Thierry
> Sent: Friday, April 09, 2010 2:25 AM
> To: jessen at econinfo.de; r-help at r-project.org
> Subject: Re: [R] How to replace all non-maximum values in a row with 0
> 
> It can be done faster and more elegant with apply and rowSums
> 
> rows <- 100000
> A <- matrix(rpois(n = rows * 20, lambda = 100), nrow = rows)
> A[4, c(1,3)] <- 1000
> 
> system.time({
> 	y <- t(apply(A, 1, function(z){
> 		1 * (z == max(z))
> 	}))
> 	y[rowSums(y) > 1, ] <- 0
> })

S+ has a rowMaxs() function but base R doesn't appear
to have one.  When ncol(A)<<nrow(A) the following version
of rowMaxs() runs faster than the above call to apply
   rowMaxs <- function(x) {
       retval<-x[,1]
       for(j in seq_len(ncol(x))[-1L]){
           which<-retval<x[,j]
           if(any(which)) retval[which]<-x[which,j]
       }
       retval
   }
so the following function, f, is faster:
   f <- function(A) {
      retval <- rowMaxs(A)==A # relies on column-major order of data in
matrix
      retval[rowSums(retval)>1,] <- 0L
      retval
   }
E.g., for your A I get:
   > system.time(z<-f(A))
      user  system elapsed 
     0.27    0.02    0.30 
   > system.time({
   + y <- t(apply(A, 1, function(z){
   + 1 * (z == max(z))
   + }))
   + y[rowSums(y) > 1, ] <- 0
   + })
      user  system elapsed 
      1.88    0.04    1.84 
   > all.equal(y,z)
   [1] TRUE

This could be sped up more, with some loss of readability,
but my point is that looping over columns instead of rows
can help when there are many more rows than columns.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 
> 
> system.time({
> 	nr <- nrow(A)
> 	nc <- ncol(A)
> 	B <- matrix(0,nrow=nr, ncol=nc)
> 	for(i in 1:nr){
> 		x <- which(A[i,]==max(A[i,]))
> 		B[i,x] <- 1
> 		if(sum(B[i,])>1) B[i,] <- as.vector(rep(0,nc))
> 	}
> })
> all.equal(y, B)
> 
> HTH,
> 
> Thierry
> 
> --------------------------------------------------------------
> ----------
> ----
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek
> team Biometrie & Kwaliteitszorg
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> 
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> 
> tel. + 32 54/436 185
> Thierry.Onkelinx at inbo.be
> www.inbo.be
> 
> To call in the statistician after the experiment is done may 
> be no more
> than asking him to perform a post-mortem examination: he may 
> be able to
> say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
> 
> The plural of anecdote is not data.
> ~ Roger Brinner
> 
> The combination of some data and an aching desire for an 
> answer does not
> ensure that a reasonable answer can be extracted from a given body of
> data.
> ~ John Tukey
>   
> 
> > -----Oorspronkelijk bericht-----
> > Van: r-help-bounces at r-project.org 
> > [mailto:r-help-bounces at r-project.org] Namens Owe Jessen
> > Verzonden: vrijdag 9 april 2010 11:08
> > Aan: r-help at r-project.org
> > Onderwerp: Re: [R] How to replace all non-maximum values in a 
> > row with 0
> > 
> > Am 09.04.2010 10:04, schrieb burgundy:
> > > Hi,
> > >
> > > I would like to replace all the max values per row with "1" 
> > and all other
> > > values with "0". If there are two max values, then "0" for 
> > both. Example:
> > >
> > > from:
> > > 2  3  0  0  200
> > > 30 0  0  2  50
> > > 0  0  3  0  0
> > > 0  0  8  8  0
> > >
> > > to:
> > > 0  0  0  0  1
> > > 0  0  0  0  1
> > > 0  0  1  0  0
> > > 0  0  0  0  0
> > >
> > > Thanks!
> > >    
> > Nice little homework to get the day started. :-)
> > 
> > This worked for me, but is probably not the shortest possible answer
> > 
> >    
> > 0,  3,  0,  
> > 0, 0,  0,  8,  8,  0), nrow = 4, byrow=T)
> > nr <- nrow(A)
> > nc <- ncol(A)
> > B <- matrix(0,nrow=nr, ncol=nc)
> > for(i in 1:nr){
> > x <- which(A[i,]==max(A[i,]))
> > B[i,x] <- 1
> > if(sum(B[i,])>1) B[i,] <- as.vector(rep(0,nc))
> > }
> > 
> > -- 
> > Owe Jessen
> > Nettelbeckstr. 5
> > 24105 Kiel
> > post at owejessen.de
> > http://privat.owejessen.de
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> > 
> 
> Druk dit bericht a.u.b. niet onnodig af.
> Please do not print this message unnecessarily.
> 
> Dit bericht en eventuele bijlagen geven enkel de visie van de 
> schrijver weer 
> en binden het INBO onder geen enkel beding, zolang dit 
> bericht niet bevestigd is
> door een geldig ondertekend document. The views expressed in  
> this message 
> and any annex are purely those of the writer and may not be 
> regarded as stating 
> an official position of INBO, as long as the message is not 
> confirmed by a duly 
> signed document.
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 



More information about the R-help mailing list