[R] find and

Ashta sewashm at gmail.com
Sat Mar 18 23:22:20 CET 2017


Thank you Rui and Ulrik.

Rui, your option worked for the small data set, but when I applied it to
the big data set it ran for a long time without finishing and I had to
kill it. I don't know why.
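
A possible explanation, not verified on the real data: with many cities,
split() creates one data frame per city and do.call(rbind, ...) then has to
bind them all back together, which tends to be slow for a large number of
pieces. A minimal base-R sketch that avoids the split/rbind step is shown
below. It uses ave() to compute the number of distinct "var" values per city
for every row; the helper names n_var and want are only illustrative, and
the data is the sample from the original question.

```r
# Sample data from the original question
DF4 <- read.table(header = TRUE, text = "city wk var
city1 1 x
city1 2 -
city1 3 x
city2 1 x
city2 2 x
city2 3 x
city2 4 x
city3 1 x
city3 2 x
city3 3 x
city3 4 x
city4 1 x
city4 2 x
city4 3 y
city4 4 y
city5 3 -
city5 4 -")

# For every row, the number of distinct 'var' values within that row's city.
# ave() applies FUN to the row indices of each city and recycles the result.
n_var <- ave(seq_along(DF4$var), DF4$city,
             FUN = function(i) length(unique(DF4$var[i])))

# Keep only the cities where 'var' takes a single value
want <- DF4[n_var == 1, ]
rownames(want) <- NULL
want
```

On the sample data this keeps city2, city3 and city5. How it compares in
speed to the dplyr version on 1.5M records would have to be measured.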


Ulrik's option worked fine for the big data set (> 1.5M records)
and took less than 2 minutes.

These two give me the same results.
# Counting unique
DF4 %>% group_by(city) %>% filter(length(unique(var)) == 1)
# Counting not duplicated
DF4 %>% group_by(city) %>% filter(sum(!duplicated(var)) == 1)
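
The two filters are expected to agree, because for a plain vector unique(v)
returns the same elements as v[!duplicated(v)], so both expressions count
the distinct values. A small base-R check on a made-up vector (v is only an
illustration, not part of the data):

```r
# Made-up example vector with repeated values
v <- c("x", "-", "x", "y", "y")

# unique() keeps exactly the elements that duplicated() does not flag
same_values <- identical(unique(v), v[!duplicated(v)])

# so both ways of counting distinct values give the same number
same_count <- length(unique(v)) == sum(!duplicated(v))

same_values
same_count
```

dplyr also provides n_distinct(), so filter(n_distinct(var) == 1) should be
a third way to write the same condition.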

Thank you again.


On Sat, Mar 18, 2017 at 10:40 AM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
> Using dplyr:
>
> library(dplyr)
>
> # Counting unique
> DF4 %>%
>   group_by(city) %>%
>   filter(length(unique(var)) == 1)
>
> # Counting not duplicated
> DF4 %>%
>   group_by(city) %>%
>   filter(sum(!duplicated(var)) == 1)
>
> HTH
> Ulrik
>
>
> On Sat, 18 Mar 2017 at 15:17 Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>
>> Hello,
>>
>> I believe this does it.
>>
>>
>> sp <- split(DF4, DF4$city)
>> want <- do.call(rbind, lapply(sp, function(x)
>>                 if(length(unique(x$var)) == 1) x else NULL))
>> rownames(want) <- NULL
>> want
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> On 18-03-2017 13:51, Ashta wrote:
>> > Hi all,
>> >
>> > I am trying to find cities that do not have the same "var" value in
>> > every row. Within a city the var should be the same; otherwise the city
>> > should be excluded from the final data set.
>> > Here is my sample data and my attempt. City1 and city4 should be
>> > excluded.
>> >
>> > DF4 <- read.table(header=TRUE, text=' city  wk var
>> > city1  1  x
>> > city1  2  -
>> > city1  3  x
>> > city2  1  x
>> > city2  2  x
>> > city2  3  x
>> > city2  4  x
>> > city3  1  x
>> > city3  2  x
>> > city3  3  x
>> > city3  4  x
>> > city4  1  x
>> > city4  2  x
>> > city4  3  y
>> > city4  4  y
>> > city5  3  -
>> > city5  4  -')
>> >
>> > My attempt:
>> >
>> > library(data.table)
>> > test2 <- data.table(DF4, key = "city,var")
>> > ID1   <- test2[!duplicated(test2), ]
>> > dps   <- ID1$city[duplicated(ID1$city)]
>> > Ddup  <- which(test2$city %in% dps)
>> >
>> > if (length(Ddup) != 0) {
>> >   test2 <- test2[-Ddup, ]
>> > }
>> >
>> > want <- data.frame(test2)
>> >
>> >
>> > I want to get the following result, but I am not getting it.
>> >
>> >    city wk var
>> >   city2  1   x
>> >   city2  2   x
>> >   city2  3   x
>> >   city2  4   x
>> >   city3  1   x
>> >   city3  2   x
>> >   city3  3   x
>> >   city3  4   x
>> >   city5  3   -
>> >   city5  4   -
>> >
>> > Can someone help me figure out what the problem is?
>> >
>> > Thank you.
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>


