[R] help matching rows of a data frame

Eric Berger ericjberger at gmail.com
Mon Sep 18 15:54:29 CEST 2017


Hi Terry,
I take your question to mean how to label distinct rows of a data frame. If
that is not your question please clarify.
I found the row.match() function in the package prodlim that can be used to
solve this.
However since your request requires no additional dependencies I borrowed
the relevant code from the row.match function.
Here is some obfuscated code that gives your answer in one line, per your
request (less obfuscated code follows just below it).

Assuming your data frame is called 'df':

df[, ncol(df) + 1] <- match(do.call("paste", c(df[, , drop = FALSE], sep = "\\r")),
                            do.call("paste", c(unique(df)[, , drop = FALSE], sep = "\\r")))

The last column of df now contains the 'label', i.e. the index within
unique(df) of the row that matches the given row. Identical rows share a
label, and labels are numbered in order of first appearance.
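
To make the meaning of the label concrete, here is a small sketch on a made-up data frame (the data and the names key, ukey, group and first_row are mine, for illustration only). It assumes no cell contains the separator string. Matching against the keys of unique(df) gives a group index; matching the keys against themselves gives the row number of the first identical row in df, in case that is what is wanted.

```r
# Hypothetical toy data: rows 1 and 2 are identical, and so are rows 3 and 4.
df <- data.frame(a = c(1, 1, 2, 2),
                 b = c("x", "x", "y", "y"),
                 stringsAsFactors = FALSE)

# One string key per row; assumes "\r" never occurs inside the data.
key  <- do.call("paste", c(df, sep = "\r"))
ukey <- do.call("paste", c(unique(df), sep = "\r"))

group     <- match(key, ukey)  # index into unique(df): 1 1 2 2
first_row <- match(key, key)   # first matching row of df: 1 1 3 3
```

The two results differ as soon as a duplicate appears before a new distinct row, as rows 3 and 4 show here.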

Somewhat less obfuscated:

getLabels <- function(df) {
    match(do.call("paste", c(df[, , drop = FALSE], sep = "\\r")),
          do.call("paste", c(unique(df)[, , drop = FALSE], sep = "\\r")))
}

myDataFrame$label <- getLabels(myDataFrame)
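
One caveat with paste-based keys: if a cell already contains the separator, two different rows can paste to the same string and would then get the same label. A minimal sketch of a guarded variant follows. getLabelsChecked is a hypothetical name of mine, not something from prodlim, and it simplifies by matching the keys against unique(key), which I believe gives the same labels as matching against the keys of unique(df).

```r
# Sketch only: getLabelsChecked is a hypothetical helper, not part of prodlim.
getLabelsChecked <- function(df, sep = "\r") {
    # Stop if any cell contains the separator, because two different
    # rows could then produce the same pasted key.
    cells <- unlist(lapply(df, as.character), use.names = FALSE)
    if (any(grepl(sep, cells, fixed = TRUE), na.rm = TRUE)) {
        stop("separator occurs in the data; choose another 'sep'")
    }
    # Drop column names so a column called 'sep' or 'collapse' cannot
    # be mistaken for an argument of paste().
    key <- do.call("paste", c(unname(as.list(df)), sep = sep))
    match(key, unique(key))
}

# Made-up example data: rows 1 and 3 are identical.
myDataFrame <- data.frame(id  = c(3, 1, 3, 1),
                          grp = c("b", "a", "b", "c"),
                          stringsAsFactors = FALSE)
myDataFrame$label <- getLabelsChecked(myDataFrame)
```

With fewer than 100 rows the extra scan over the cells costs nothing noticeable, so the check seems a reasonable trade for safety.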


HTH,

Eric


On Mon, Sep 18, 2017 at 3:13 PM, Therneau, Terry M., Ph.D. <
therneau at mayo.edu> wrote:

> This question likely has a 1 line answer, I'm just not seeing it.  (2, 3,
> or 10 lines is fine too.)
>
> For a vector I can do group <- match(x, unique(x)) to get a vector that
> labels each element of x.
> What is an equivalent if x is a data frame?
>
> The result does not have to be fast: the data set will have < 100
> elements.  Since this is inside the survival package, and that package is
> on the 'recommended' list, I can't depend on any package outside the
> recommended list.
>
> Terry T.
