[R] Deduping in R by multiple variables

ramoss ramine.mossadegh at finra.org
Wed Aug 29 22:57:52 CEST 2012


I have a dataset with 184K observations and 16 variables.  In SAS, PROC SORT
with NODUPKEY dedupes it by 11 variables in seconds.
I tried to do the same thing in R using both unique() and !duplicated(),
but R just hangs and I get no output.  Does anyone know how to solve this?

This is how I tried to do it in R:


detail3 <-
  detail2[!duplicated(c(detail2$TDATE, detail2$FIRM, detail2$CM, detail2$BRANCH,
                        detail2$BEGTIME, detail2$ENDTIME, detail2$OTYPE,
                        detail2$OCOND, detail2$ACCTYP, detail2$OSIDE,
                        detail2$SHARES, detail2$STOCKS, detail2$STKFUL)), ]

detail3 <-
  unique(detail2[, c(detail2$TDATE, detail2$FIRM, detail2$CM, detail2$BRANCH,
                     detail2$BEGTIME, detail2$ENDTIME, detail2$OTYPE,
                     detail2$OCOND, detail2$ACCTYP, detail2$OSIDE,
                     detail2$SHARES, detail2$STOCKS, detail2$STKFUL)])
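
For comparison, here is a minimal sketch of one way the dedup could be written, assuming the goal is to keep the first row for each distinct combination of the key columns in the current row order. The toy data frame, its values, and the shortened key list are made up for illustration; only the column names TDATE, FIRM and SHARES come from the code above. The point it shows is that duplicated() applied to a data frame of the key columns compares whole rows, whereas c() concatenates the columns into one long vector.

```r
# Toy stand-in for detail2 (hypothetical values, only three of the key columns)
detail2 <- data.frame(
  TDATE  = c("2012-08-01", "2012-08-01", "2012-08-02", "2012-08-01"),
  FIRM   = c("A", "A", "A", "B"),
  SHARES = c(100, 100, 100, 100),
  NOTE   = c("first", "dup of first", "other date", "other firm"),
  stringsAsFactors = FALSE
)

# Key columns selected by name; the real list would hold all the dedup variables
keys <- c("TDATE", "FIRM", "SHARES")

# duplicated() on a data frame flags rows whose key combination appeared earlier,
# so negating it keeps the first occurrence of each combination
detail3 <- detail2[!duplicated(detail2[, keys]), ]

print(detail3)
```

All 16 columns are kept in the result; only the columns named in keys decide which rows count as duplicates.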




Thanks in advance





