[R] data frame formatting

boB Rudis bob at rudis.net
Tue Aug 18 21:27:08 CEST 2015


Here's one way in base R:

df <- data.frame(id=c("A","A","B","B"),
                 first=c("BX",NA,NA,"LF"),
                 second=c(NA,"TD","BZ",NA),
                 third=c(NA,NA,"RB","BT"),
                 fourth=c("LG","QR",NA,NA))


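# Split df by id, collapse each non-id column within a group, then
# rbind the per-id results into one row per id.
new_df <- data.frame(do.call(rbind, by(df, df$id, function(x) {
  sapply(x[,-1], function(y) {
    if (all(is.na(y))) return(NA)              # no value for this id
    if (all(!is.na(y))) return("clash")        # every row has a value
    return(as.character(y[which(!is.na(y))]))  # otherwise keep the non-NA value
  })
})))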

new_df$id <- rownames(new_df)
rownames(new_df) <- NULL

new_df

##   first second third fourth id
## 1    BX     TD  <NA>  clash  A
## 2    LF     BZ clash   <NA>  B
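
The all(!is.na(y)) test above relies on each id having exactly two rows. If an id can have three or more rows, a column with two values and one NA would slip past it and return two values instead of "clash". Below is a rough sketch of a variant that counts the non-NA values instead. It assumes "clash" should mean "more than one non-missing value for that id and column", which is my reading of the example and not something stated in the question. collapse_col and new_df2 are names made up for this illustration.

```r
df <- data.frame(id=c("A","A","B","B"),
                 first=c("BX",NA,NA,"LF"),
                 second=c(NA,"TD","BZ",NA),
                 third=c(NA,NA,"RB","BT"),
                 fourth=c("LG","QR",NA,NA))

# Hypothetical helper (not part of the code above): collapse one column
# of one id group to a single value.
collapse_col <- function(y) {
  vals <- as.character(y[!is.na(y)])           # non-missing values only
  if (length(vals) == 0) return(NA_character_) # nothing recorded
  if (length(vals) > 1) return("clash")        # assumed rule: 2+ values clash
  vals                                         # exactly one value: keep it
}

# One data frame per id (without the id column), collapsed column by column.
res <- lapply(split(df[-1], df$id), function(g) sapply(g, collapse_col))

# Bind the per-id vectors into rows and put the id back as a column.
new_df2 <- data.frame(id=names(res), do.call(rbind, res),
                      row.names=NULL, stringsAsFactors=FALSE)

new_df2

# A three-value group with one NA now counts as a clash:
collapse_col(c("BX", NA, "LF"))
```

On the sample data this should give the same two rows as above, with id as the first column instead of the last.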


On Tue, Aug 18, 2015 at 3:06 PM, Jon BR <jonsleepy at gmail.com> wrote:
> df <-
> data.frame(id=c("A","A","B","B"),first=c("BX",NA,NA,"LF"),second=c(NA,"TD","BZ",NA),third=c(NA,NA,"RB","BT"),fourth=c("LG","QR",NA,NA))
>> df


