[R] identifying when one element of a row has a positive number

Joshua Wiley jwiley.psych at gmail.com
Thu Jan 27 11:10:07 CET 2011


Hi,

This problem seemed deceptively simple to me.  After chasing a
considerable number of dead ends, I came up with fg().  It lacks the
elegance of Dennis' solution but is substantially faster, particularly
for large datasets.  I still feel like I'm missing something, though.

###############################################
## Data
df1 <- data.frame(x = seq(1860,1950,by=10),
  y = seq(-290,-200,by=10), ANN = c(3,0,0,0,1,0,1,1,0,0),
  CTA = c(0,1,0,0,0,0,1,0,0,2), GLM = c(0,0,2,0,0,0,0,1,0,0))
## larger test dataset
dftest <- do.call("rbind", rep(list(df1), 100))


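## Dennis' row-wise helpers: f names the only positive column,
## g names the only zero column (given three columns); otherwise NA
f <- function(x) ifelse(sum(x > 0) == 1L, names(which(x > 0)), NA)
g <- function(x) ifelse(sum(x > 0) == 2L, names(which(x == 0L)), NA)

fg <- function(dat) {
  cnames <- colnames(dat)
  dat <- dat > 0      # logical matrix: TRUE where a value is positive
  z <- rowSums(dat)   # number of positive values per row
  z1 <- z == 1L; z2 <- z == 2L
  output <- matrix(NA_character_, nrow = nrow(dat), ncol = 2)
  ## drop = FALSE keeps a matrix even when only one row is selected;
  ## without it apply() fails on the resulting vector
  output[z1, 1] <- apply(dat[z1, , drop = FALSE], 1, function(x) cnames[x])
  output[z2, 2] <- apply(dat[z2, , drop = FALSE], 1, function(x) cnames[!x])
  output
}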

## Compare times on larger dataset
system.time(cbind(apply(dftest[, 3:5], 1, f),
  apply(dftest[, 3:5], 1, g)))
system.time(fg(dftest[, 3:5]))

## Compare times under repetitions
system.time(for (i in 1:100) cbind(apply(df1[, 3:5], 1, f),
  apply(df1[, 3:5], 1, g)))
system.time(for (i in 1:100) fg(df1[, 3:5]))
###############################################
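
A possible fully vectorized variant, offered as a rough sketch rather
than a tested replacement for fg(): which(..., arr.ind = TRUE) returns
the row and column positions of the matching cells directly, so no
apply() call is needed.  The names fg2, hits_one and hits_zero are made
up for illustration, and treating "one absence" as "all but one column
positive" is an assumption that happens to match the three-column case
above.

```r
## Sketch only: fg2, hits_one and hits_zero are illustrative names
df1 <- data.frame(x = seq(1860, 1950, by = 10),
  y = seq(-290, -200, by = 10), ANN = c(3,0,0,0,1,0,1,1,0,0),
  CTA = c(0,1,0,0,0,0,1,0,0,2), GLM = c(0,0,2,0,0,0,0,1,0,0))

fg2 <- function(dat) {
  pos <- as.matrix(dat) > 0   # TRUE where a value is positive
  n_pos <- rowSums(pos)       # number of positive values per row
  out <- matrix(NA_character_, nrow = nrow(pos), ncol = 2,
    dimnames = list(NULL, c("one_presence", "one_absence")))
  ## n_pos is recycled down the columns, so it lines up with the rows
  hits_one <- which(pos & (n_pos == 1L), arr.ind = TRUE)
  hits_zero <- which(!pos & (n_pos == ncol(pos) - 1L), arr.ind = TRUE)
  out[hits_one[, "row"], 1] <- colnames(pos)[hits_one[, "col"]]
  out[hits_zero[, "row"], 2] <- colnames(pos)[hits_zero[, "col"]]
  out
}

## Attach the two new columns to the original data
df2 <- cbind(df1, fg2(df1[, 3:5]))
```

Wrapping a call such as fg2(dftest[, 3:5]) in system.time() would allow
a comparison with the versions above; no timings are claimed here.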

Josh


On Thu, Jan 27, 2011 at 12:36 AM, Dennis Murphy <djmuser at gmail.com> wrote:
> Hi:
>
> Try this:
>
> f <- function(x) ifelse(sum(x > 0) == 1L, names(which(x > 0)), NA)
> g <- function(x) ifelse(sum(x > 0) == 2L, names(which(x == 0L)), NA)
>> apply(df1[, 3:5], 1, f)
>  [1] "ANN" "CTA" "GLM" NA    "ANN" NA    NA    NA    NA    "CTA"
>> apply(df1[, 3:5], 1, g)
>  [1] NA    NA    NA    NA    NA    NA    "GLM" "CTA" NA    NA
>
> HTH,
> Dennis
>
> On Wed, Jan 26, 2011 at 9:36 PM, Daisy Englert Duursma <daisy.duursma at gmail.com> wrote:
>
>> Hello,
>>
>> I am not sure where to begin with this problem or what to search for
>> in r-help. I just don't know what to call this.
>>
>> I have 5 columns: the first 2 are the x,y locations and the last
>> three are variables about those locations.
>>
>> x<-seq(1860,1950,by=10)
>> y<-seq(-290,-200,by=10)
>> ANN<-c(3,0,0,0,1,0,1,1,0,0)
>> CTA<-c(0,1,0,0,0,0,1,0,0,2)
>> GLM<-c(0,0,2,0,0,0,0,1,0,0)
>> df1<-as.data.frame(cbind(x,y,ANN,CTA,GLM))
>>
>> What I would like to produce is an additional column that tells when
>> only 1 of the three variables has a value greater than 0. I would like
>> this new column to give the name of the variable. Likewise, I would
>> like a column that tells when only one of the three variables for a
>> given row has a value of 0. For my example the new columns would be:
>>
>> one_presence<-c("ANN","CTA","GLM","NA","ANN","NA","NA","NA","NA","CTA")
>> one_absence<-c("NA","NA","NA","NA","NA","NA","GLM","CTA","NA","NA")
>>
>> The end result should look like
>>
>> df2<-(cbind(df1,one_presence,one_absence))
>>
>> I am sure I can do this with a loop or maybe grep but I am out of ideas.
>>
>> Any help would be appreciated.
>>
>> Cheers,
>> Daisy
>>
>> --
>> Daisy Englert Duursma
>>
>> Room E8C156
>> Dept. Biological Sciences
>> Macquarie University  NSW  2109
>> Australia
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


