[R] help matching rows of a data frame

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Mon Sep 18 16:11:59 CEST 2017


"Label" is not a well-defined term for data frames, but most data frames have row names. If dta is a plain data frame (not a tibble), the row names of its distinct rows are

rownames( dta )[ !duplicated( dta ) ]
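
As a small illustration (the data frame here is made up for the example, not taken from the thread): duplicated() on a data frame flags every row that repeats an earlier row, so negating it keeps the first occurrence of each distinct row.

```r
# Hypothetical example data: rows 1 and 3 are identical, as are rows 2 and 4.
dta <- data.frame(a = c(1, 2, 1, 2, 3),
                  b = c("x", "y", "x", "y", "x"),
                  stringsAsFactors = FALSE)

# TRUE for each row that repeats an earlier row
dup <- duplicated(dta)

# Row names of the first occurrence of each distinct row
first_names <- rownames(dta)[!dup]
print(first_names)
```

With default (automatic) row names these are just the row numbers as character strings.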

Or you could use the row indexes directly:

which( !duplicated( dta ) )
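
A rough sketch, again on made-up data, of how these indexes relate to the group labels asked about in the original question. which() gives the positions of the first occurrences; pasting each row into a single character key (the idea used in the quoted message below) then lets match() give every row the number of its distinct group. The names first_idx, key and group, and the choice of separator, are assumptions for the example. Pasted keys can collide if the data themselves contain the separator.

```r
# Hypothetical example data: rows 1 and 3 are identical, as are rows 2 and 4.
dta <- data.frame(a = c(1, 2, 1, 2, 3),
                  b = c("x", "y", "x", "y", "x"),
                  stringsAsFactors = FALSE)

# Positions of the first occurrence of each distinct row
first_idx <- which(!duplicated(dta))

# One character key per row, built by pasting the columns together
key <- do.call(paste, c(dta, sep = "\r"))

# Label each row with the index of its distinct row (1, 2, ... in order of
# first appearance), the data frame analogue of match(x, unique(x))
group <- match(key, key[first_idx])
print(group)
```

Only base R is used here, so nothing outside the recommended packages is needed.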
-- 
Sent from my phone. Please excuse my brevity.

On September 18, 2017 6:54:29 AM PDT, Eric Berger <ericjberger at gmail.com> wrote:
>Hi Terry,
>I take your question to mean how to label the distinct rows of a data frame.
>If that is not your question, please clarify.
>I found the row.match() function in the package prodlim, which can be used
>to solve this.
>However, since your request rules out additional dependencies, I borrowed
>the relevant code from the row.match function.
>Here is some obfuscated code to provide your answer in one line, per your
>request (less obfuscated code is just below that).
>
>Assuming your data frame is called 'df':
>
>df[,ncol(df)+1] <- match( do.call("paste", c(df[, , drop = FALSE], sep = "\\r")),
>                          do.call("paste", c(unique(df)[, , drop = FALSE], sep = "\\r")) )
>
>The last column of df now contains the 'label', i.e. the index, within
>unique(df), of the distinct row that is the same as the given row.
>
>Somewhat less obfuscated
>
>getLabels <- function(df) {
>    match( do.call("paste", c(df[, , drop = FALSE], sep = "\\r")),
>           do.call("paste", c(unique(df)[, , drop = FALSE], sep = "\\r")) )
>}
>
>myDataFrame$label <- getLabels(myDataFrame)
>
>
>HTH,
>
>Eric
>
>
>On Mon, Sep 18, 2017 at 3:13 PM, Therneau, Terry M., Ph.D. <
>therneau at mayo.edu> wrote:
>
>> This question likely has a 1 line answer, I'm just not seeing it. (2, 3,
>> or 10 lines is fine too.)
>>
>> For a vector I can do group <- match(x, unique(x)) to get a vector that
>> labels each element of x.
>> What is an equivalent if x is a data frame?
>>
>> The result does not have to be fast: the data set will have < 100
>> elements.  Since this is inside the survival package, and that package is
>> on the 'recommended' list, I can't depend on any package outside the
>> recommended list.
>>
>> Terry T.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>


