[R] Select Original and Duplicates

arun smartpink111 at yahoo.com
Fri Sep 28 23:01:34 CEST 2012


Hi,

You can also try:
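# For each column, flag the values that occur more than once in that column
idx <- data.frame(t(sapply(df, function(x) !is.na(match(x, x[duplicated(x)])))))
# Keep the rows whose value is flagged in every column
df1 <- df[sapply(idx, function(x) all(x == TRUE)), ]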
df1
#  label value
#1     A     4
#2     B     3
#4     B     3
#7     A     4
#8     A     4
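
One caveat worth checking: this approach flags repeated values within each column separately, so it can select a row whose exact combination of label and value occurs only once. A minimal sketch, assuming a made-up four-row data frame (`df2`, `colwise`, and `rowwise` are hypothetical names, not from the thread), comparing it with a row-wise check:

```r
# Hypothetical data: no row is repeated, but every value repeats within its column
df2 <- data.frame(
  label = c("A", "A", "B", "B"),
  value = c(1, 2, 1, 2)
)

# Column-wise check, same logic as the code above
idx2 <- data.frame(t(sapply(df2, function(x) !is.na(match(x, x[duplicated(x)])))))
colwise <- df2[sapply(idx2, function(x) all(x == TRUE)), ]

# Row-wise check: compares whole rows rather than individual columns
rowwise <- df2[duplicated(df2) | duplicated(df2, fromLast = TRUE), ]

nrow(colwise)  # all four rows are selected
nrow(rowwise)  # no rows are selected
```

On the sample data in this thread the two checks happen to agree, so the difference only shows up on data like `df2`.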

A.K.

----- Original Message -----
From: Rui Barradas <ruipbarradas at sapo.pt>
To: Adam Gabbert <adamjgabbert at gmail.com>
Cc: r-help at r-project.org
Sent: Friday, September 28, 2012 4:22 PM
Subject: Re: [R] Select Original and Duplicates

Hello,

Try the following.


idx <- duplicated(df) | duplicated(df, fromLast = TRUE)
df[idx, ]

Note that the selected rows are returned in their original order in df.
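
To see why the two calls are combined, here is a minimal sketch on the sample data from the question, rebuilt with data.frame() for convenience. The names `later`, `earlier`, and `result` are illustrative only:

```r
# Sample data from the original question
df <- data.frame(
  label = c("A", "B", "C", "B", "B", "A", "A", "A"),
  value = c(4, 3, 6, 3, 1, 2, 4, 4)
)

# TRUE for every repeat of a row after its first occurrence
later <- duplicated(df)
# TRUE for every occurrence of a row before its last one
earlier <- duplicated(df, fromLast = TRUE)

# A row that belongs to a duplicated group is TRUE in at least one vector
idx <- later | earlier
result <- df[idx, ]
print(result)
```

duplicated(df) alone misses the first row of each group, and the fromLast = TRUE call alone misses the last one, so the union covers every member of the group.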

Hope this helps,

Rui Barradas

On 28-09-2012 21:11, Adam Gabbert wrote:
> I would like to select all the duplicate rows of a data frame including
> the original.  Any help would be much appreciated.  This is where I'm at so
> far. Thanks.
>
> #Sample data frame:
> df <- read.table(header=T, con <- textConnection('
>   label value
>       A     4
>       B     3
>       C     6
>       B     3
>       B     1
>       A     2
>       A     4
>       A     4
> '))
> close(con)
>
> # Duplicate entries
> df[duplicated(df),]
>
> # label value
> #     B     3
> #     A     4
> #     A     4
>
> #I want to select all the rows that are duplicated including the original
> #This is the output I want
> # label value
> #     B     3
> #     B     3
> #     A     4
> #     A     4
> #     A     4
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
