[R] How to combine conditional argument and logical argument in R to create subset of data...

arun smartpink111 at yahoo.com
Wed Mar 6 22:29:26 CET 2013



Hi,
How about this:

# Build one key per row; the separator keeps e.g. (1, 12) and (11, 2) distinct
indxTem1<-paste(Tem1[,1],Tem1[,2],sep="_")
indxTem2<-paste(Tem2[,1],Tem2[,2],sep="_")
Tem1[!indxTem1%in%indxTem2,]
#       V1 V2
# [1,] 333 11
# [2,] 111 16
# [3,] 111 17
# [4,] 111 20
# [5,] 222 21
# [6,] 222 22
# [7,] 222 23
# [8,] 222  1
# [9,] 222  2
#[10,] 333  3
#[11,] 333  4
#[12,] 333  5
#[13,] 333  6
#[14,] 333  7


A.K.
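
[Editorial sketch, not part of the original reply.] The key idea above can be generalised to any number of columns. The helper names row_key() and rows_not_in() are hypothetical, and the "|" separator assumes no cell contains that character. A separator matters because keys joined without one can collide: paste0(1, 12) and paste0(11, 2) both give "112".

```r
# Data from the question: V2 contains duplicated values.
V1 <- rep(c(rep(111, 5), rep(222, 5), rep(333, 5)), 2)
V2 <- c(1:23, 1:7)
Tem1 <- cbind(V1, V2)
Tem2 <- Tem1[c(1:10, 12:15, 18:19), ]

# Hypothetical helper: one string per row, all columns joined by "|".
row_key <- function(m) {
  cols <- lapply(seq_len(ncol(m)), function(j) m[, j])
  do.call(paste, c(cols, sep = "|"))
}

# Hypothetical helper: rows of `a` whose key does not occur among the rows of `b`.
rows_not_in <- function(a, b) {
  a[!row_key(a) %in% row_key(b), , drop = FALSE]
}

res <- rows_not_in(Tem1, Tem2)
nrow(res)   # 14, the rows listed in the reply above
```

For data of several GB the character keys themselves take noticeable memory, so this is a starting point rather than a tuned solution.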
________________________________
From: HJ YAN <yhj204 at googlemail.com>
To: arun <smartpink111 at yahoo.com> 
Cc: r-help at r-project.org 
Sent: Wednesday, March 6, 2013 4:09 PM
Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...


Dear Arun


Thanks a million for your prompt reply and I love all four ways in your reply. 

I tried the code and realised there is an issue: my real data is about 4 GB and V2 certainly contains many duplicated values, so V1 and V2 should look more like this:


V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)  # V1 here are some data index with lots of repeated numeric values
V2<-c(1:23, 1:7)  # there are also duplicated values in V2
Tem1<-cbind(V1,V2)
Tem2<-Tem1[c(1:10,12:15,18:19),] # I know that Tem2 is a subset of Tem1...


So how do I get the difference between Tem1 and Tem2 when the values in V2 contain duplicates? The outcome should be:

  V1 V2
 333 11
 111 16
 111 17
 111 20
 222 21
 222 22
 222 23
 222  1
 222  2
 333  3
 333  4
 333  5
 333  6
 333  7


Massive thanks
HJ





On Wed, Mar 6, 2013 at 4:12 PM, arun <smartpink111 at yahoo.com> wrote:


>
>Just to add:
>
>Tem1[Tem1[,2]%in%setdiff(Tem1[,2],Tem2[,2]),]
>
>A.K.
>
>----- Original Message -----
>
>From: arun <smartpink111 at yahoo.com>
>To: HJ YAN <yhj204 at googlemail.com>
>Cc: R help <r-help at r-project.org>
>Sent: Wednesday, March 6, 2013 11:06 AM
>Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
>
>Hi,
>No problem.
>V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
> length(V1)
>#[1] 30
>
> V2<- c(1:30) #should be the same length as V1
>Tem1<- cbind(V1,V2)
>Tem2<-Tem1[1:20,]
>
>Tem1[!Tem1[,2]%in%Tem2[,2],]
> #      V1 V2
> #[1,] 222 21
> #[2,] 222 22
> #[3,] 222 23
> #[4,] 222 24
> #[5,] 222 25
> #[6,] 333 26
> #[7,] 333 27
> #[8,] 333 28
> #[9,] 333 29
>#[10,] 333 30
>
>#or
>subset(Tem1,!V2%in% Tem2[,2])
>#or
> Tem1[is.na(match(Tem1[,2],Tem2[,2])),]
> #      V1 V2
> #[1,] 222 21
> #[2,] 222 22
> #[3,] 222 23
> #[4,] 222 24
> #[5,] 222 25
> #[6,] 333 26
> #[7,] 333 27
> #[8,] 333 28
> #[9,] 333 29
>#[10,] 333 30
>A.K.
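
[Editorial sketch, not part of the original reply.] The three forms above (and the setdiff() variant posted later) should be interchangeable, since %in% is defined in base R as match(x, table, nomatch = 0L) > 0L. All of them compare column 2 only, so they assume the V2 values are unique. The names a, b, d, Tem1dup, Tem2dup and n_dup below are illustrative.

```r
V1 <- rep(c(rep(111, 5), rep(222, 5), rep(333, 5)), 2)
V2 <- 1:30
Tem1 <- cbind(V1, V2)
Tem2 <- Tem1[1:20, ]

a <- Tem1[!Tem1[, 2] %in% Tem2[, 2], ]
b <- Tem1[is.na(match(Tem1[, 2], Tem2[, 2])), ]
d <- Tem1[Tem1[, 2] %in% setdiff(Tem1[, 2], Tem2[, 2]), ]
identical(a, b) && identical(a, d)

# With duplicated V2 values, matching on column 2 alone drops too many rows:
Tem1dup <- cbind(V1, V2 = c(1:23, 1:7))
Tem2dup <- Tem1dup[c(1:10, 12:15, 18:19), ]
n_dup <- nrow(Tem1dup[!Tem1dup[, 2] %in% Tem2dup[, 2], ])
n_dup   # 7, although 14 rows of Tem1dup are missing from Tem2dup
```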
>
>
>
>
>________________________________
>From: HJ YAN <yhj204 at googlemail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Wednesday, March 6, 2013 10:33 AM
>Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
>
>
>Thank you SO MUCH Arun!!! 
>
>That's brilliant! I've learnt some very useful new R commands now, e.g. 'do.call' and 'split', and I see where my code went wrong.
>
>I greatly appreciate your prompt reply.
>
>Also, I wonder whether there is a package that can find the difference between two data frames, e.g. when one is a subset of the other? For example:
>
> V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
> V2<-c(1:23)
>Tem1<-cbind(V1,V2)
>
>Tem2<-Tem1[1:20,]
>
>
>How do I get outcome like 
>
>[21,] 333 21
>[22,] 333 22
>[23,] 333 23
>
>
>P.S. I used 'setdiff' before, but it seems to work only for vectors, not for data frames?
>
>
>Sorry for so many questions today, as I'm coding for a work deadline tonight.
>
>
>Many thanks!
>Cheers
>HJ
>
>On Wed, Mar 6, 2013 at 1:55 PM, arun <smartpink111 at yahoo.com> wrote:
>
>Hi,
>>You can also try this:
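>> Tem3<- list()
>> for(i in unique(Tem1[,1])) {
>> # index by name: Tem3[[111]] would pad the list with 110 NULL entries
>> Tem3[[as.character(i)]]<- subset(Tem1,Tem1[,1]==i)
>> }
>> # bind once, after the loop has collected every group
>> Tem4<- do.call(rbind,unname(Tem3))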
>>head(Tem4)
>>#      V1 V2
>>#[1,] 111  1
>>#[2,] 111  2
>>#[3,] 111  3
>>#[4,] 111  4
>>#[5,] 111 13
>>#[6,] 111 14
>>
>>
>>#or
>>Tem3<-c(NA,NA)   # placeholder first row, dropped after the loop
>> for(i in unique(Tem1[,1])) {
>> Tem2<- subset(Tem1, Tem1[,1]==i)
>> Tem3<- rbind(Tem3,Tem2)
>> }
>> Tem5<- Tem3[-1,]
>>head(Tem5)
>>#  V1 V2
>># 111  1
>># 111  2
>># 111  3
>># 111  4
>># 111 13
>># 111 14
>>
>>A.K.
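
[Editorial sketch, not part of the original reply.] The same grouping can probably be had without an explicit loop. Two options are sketched below: a stable sort on column 1 with order(), and split() on row indices followed by do.call(rbind, ...). The names sorted, idx, pieces and bound are illustrative.

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

# order() leaves tied values in their original order, so the rows
# inside each V1 group keep the order they had in Tem1.
sorted <- Tem1[order(Tem1[, 1]), ]

# Alternative: split the row indices by group, subset, then bind once.
idx <- split(seq_len(nrow(Tem1)), Tem1[, 1])
pieces <- lapply(idx, function(i) Tem1[i, , drop = FALSE])
bound <- do.call(rbind, unname(pieces))

all(sorted == bound)
head(sorted)
```

The order() version avoids building intermediate pieces, which may matter on large inputs.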
>>
>>
>>________________________________
>>From: HJ YAN <yhj204 at googlemail.com>
>>
>>To: arun <smartpink111 at yahoo.com>
>>Cc: r-help at r-project.org
>>Sent: Wednesday, March 6, 2013 8:24 AM
>>Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
>>
>>
>>
>>Hi Arun
>>
>>
>>Thank you so much for the help, that's really helpful!!
>>
>>Also I have a quick question about the code below where I can not see why it doesn't work...
>>
>>I know the I shou
>>
>>V1<-c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
>>V2<-c(1:23)
>>Tem1<-cbind(V1,V2)
>>
>>
>>So Tem1 looks like...
>>> Tem1
>>       V1 V2
>> [1,] 111  1
>> [2,] 111  2
>> [3,] 111  3
>> [4,] 111  4
>> [5,] 222  5
>> [6,] 222  6
>> [7,] 222  7
>> [8,] 222  8
>> [9,] 333  9
>>[10,] 333 10
>>[11,] 333 11
>>[12,] 333 12
>>[13,] 111 13
>>[14,] 111 14
>>[15,] 111 15
>>[16,] 111 16
>>[17,] 222 17
>>[18,] 222 18
>>[19,] 222 19
>>[20,] 222 20
>>[21,] 333 21
>>[22,] 333 22
>>[23,] 333 23
>>
>>I would like the outcome to be...
>>
>>      V1 V2
>>
>>     111  1
>>     111  2
>>     111  3
>>     111  4
>>     111 13
>>     111 14
>>     111 15
>>     111 16
>>     222  5
>>     222  6
>>     222  7
>>     222  8
>>     222 17
>>     222 18
>>     222 19
>>     222 20
>>     333  9
>>     333 10
>>     333 11
>>     333 12
>>     333 21
>>     333 22
>>     333 23
>>
>>
>>So I tried code as below 
>>------------------------------------------
>>Tem3<-c(NA,NA)
>>for(i in length(unique(Tem1[,1]))){
>>Tem2<-subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i])
>>Tem3<-rbind(Tem3,Tem2)
>>Tem3
>>}
>>Tem4<-Tem3[-1,]
>>---------------------------------------
>>
>>And only get this...
>>
>>
>> V1 V2
>> 333  9
>> 333 10
>> 333 11
>> 333 12
>> 333 21
>> 333 22
>> 333 23
>>
>>
>>When I ran the code step by step (setting i=1, then i=2, then i=3, and updating Tem3 each time), I did get what I wanted, so I wonder why the loop above does not work.
>>
>>
>>Many thanks in advance!
>>
>>HJ
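
[Editorial sketch, not part of the original message.] The likely cause is the loop header: length(unique(Tem1[,1])) is the single number 3, so for(i in length(...)) runs exactly once with i = 3, which matches the output containing only the 333 rows. Iterating over seq_along() visits every group. The variable name groups is illustrative.

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

groups <- unique(Tem1[, 1])
length(groups)      # 3: a single value, so the original loop body ran only for i = 3
seq_along(groups)   # 1 2 3: one index per group

Tem3 <- NULL        # rbind(NULL, m) returns m, so no NA placeholder row is needed
for (i in seq_along(groups)) {
  Tem3 <- rbind(Tem3, Tem1[Tem1[, 1] == groups[i], , drop = FALSE])
}
head(Tem3)
```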
>>
>>
>>On Wed, Mar 6, 2013 at 4:36 AM, arun <smartpink111 at yahoo.com> wrote:
>>
>>Hi,
>>>
>>> b[b[,4]>15 & (b[,1]>4|is.na(b[,1])) & (b[,2]>4|is.na(b[,2])),]
>>> #    [,1] [,2] [,3] [,4] [,5]
>>>#[1,]    6   NA   NA   16   20
>>>#[2,]   NA    5   NA   17   21
>>>A.K.
>>>
>>>
>>>
>>>----- Original Message -----
>>>From: HJ YAN <yhj204 at googlemail.com>
>>>To: r-help at r-project.org
>>>Cc:
>>>Sent: Tuesday, March 5, 2013 9:33 PM
>>>Subject: [R] How to combine conditional argument and logical argument in R to create subset of data...
>>>
>>>Dear R user
>>>
>>>I have data created using code below
>>>
>>>b<-matrix(2:21,nrow=4)
>>>b[,1:3]=NA
>>>b[4,2]=5
>>>b[3,1]=6
>>>
>>>Now the data is
>>>
>>>> b
>>>     [,1] [,2] [,3] [,4] [,5]
>>>[1,]   NA   NA   NA   14   18
>>>[2,]   NA   NA   NA   15   19
>>>[3,]    6   NA   NA   16   20
>>>[4,]   NA    5   NA   17   21
>>>
>>>
>>>I want to keep the rows where column 4 is greater than 15 and the values in
>>>columns 1 and 2 are each either greater than 4 or 'NA'. So I would like
>>>my outcome to be as below...
>>>
>>>[3,]    6   NA   NA   16   20
>>>[4,]   NA    5   NA   17   21
>>>
>>>I thought something like the code below would work, but it only returns
>>>the last row, i.e. "NA 5 NA 17 21"...
>>>
>>>bb<-b[which( (b[,2]>4 | b[,2]==NA) & (b[,1]>4 | b[,1]==NA) & b[,4]>15) ,]
>>>
>>>
>>>Please could anyone help?
>>>
>>>Many thanks in advance
>>>
>>>HJ
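
[Editorial sketch, not part of the original message.] The comparison x == NA never yields TRUE: any comparison with NA is itself NA, and which() silently drops NA entries. Testing with is.na() instead, as in the reply quoted above, gives the two wanted rows. The name keep is illustrative.

```r
b <- matrix(2:21, nrow = 4)
b[, 1:3] <- NA
b[4, 2] <- 5
b[3, 1] <- 6

b[, 1] == NA     # NA NA NA NA: never TRUE, even where the value is missing
is.na(b[, 1])    # TRUE TRUE FALSE TRUE

# Keep rows where column 4 > 15 and columns 1 and 2 are each > 4 or missing.
keep <- b[, 4] > 15 &
  (b[, 1] > 4 | is.na(b[, 1])) &
  (b[, 2] > 4 | is.na(b[, 2]))
bb <- b[keep, , drop = FALSE]
bb
```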
>>>
>>>
>>>______________________________________________
>>>R-help at r-project.org mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>and provide commented, minimal, self-contained, reproducible code.
>>>
>>>     
>>
>