[R] duplicated() with conditional statement

arun smartpink111 at yahoo.com
Fri Jul 26 03:26:32 CEST 2013


Hi,
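Sorry, `indx` should be:
indx <- which(tt$response == "buy")  # I changed indx but forgot to update it in my post
tt$newcolumn <- 0
flagged <- unlist(lapply(seq_along(indx), function(i) {
  # rows between this "buy" and the next one (or the last row itself)
  x1 <- if (indx[i] == nrow(tt)) indx[i] else seq(indx[i] + 1, indx[i + 1] - 1)
  # all "buy" rows so far, followed by the rows in x1
  x2 <- rbind(tt[indx[1:i], ], tt[x1, ])
  if (any(x2$response == "sample")) row.names(x2[duplicated(x2$product), ])
}))
tt[flagged, "newcolumn"] <- 1
tt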
   subj response product newcolumn
1     1   sample       1         0
2     1   sample       2         0
3     1      buy       3         0
4     2   sample       2         0
5     2      buy       2         0
6     3   sample       3         1
7     3   sample       2         1
8     3      buy       1         0
9     4   sample       1         1
10    4      buy       4         0
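
The same flags can also be computed without the block-wise rbind(). This is only a minimal sketch, assuming the rule is "a row gets 1 when its product was already bought in an earlier row", which is one reading of the example. The names `is_buy`, `rows` and `bought_before` are introduced here for illustration, and the thread does not say whether a repeated "buy" of the same product should be flagged as well.

```r
# Data from the original post
subj <- c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4)
response <- c("sample", "sample", "buy", "sample", "buy",
              "sample", "sample", "buy", "sample", "buy")
product <- c(1, 2, 3, 2, 2, 3, 2, 1, 1, 4)
tt <- data.frame(subj, response, product)

is_buy <- tt$response == "buy"
rows <- seq_len(nrow(tt))

# Assumption: flag row i when its product appears in a "buy" row above it
bought_before <- vapply(rows, function(i) {
  tt$product[i] %in% tt$product[is_buy & rows < i]
}, logical(1))

tt$newcolumn <- as.integer(bought_before)
```

On the example data this gives 1 in rows 6, 7 and 9 and 0 elsewhere.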
A.K.







________________________________
From: vanessa van der vaart <vanessa.vaart at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Thursday, July 25, 2013 8:55 PM
Subject: Re: duplicated() with conditional statement



Hi, thanks for the code. I tried it, but I got this error message:
Error in from:to : NA/NaN argument


I don't know what it means, and I don't know how to fix it.
Could you help, please?

Thank you very much in advance.
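
The error most likely comes from the last element of `indx`. With `indx<-which(tt$response!="buy")` the final index is 9 rather than nrow(tt), so the code evaluates indx[i+1], which lies past the end of the vector and is therefore NA, and then passes it to seq(). A minimal sketch of that failure, assuming the data from the original post (the exact wording of the message may differ between R versions):

```r
response <- c("sample", "sample", "buy", "sample", "buy",
              "sample", "sample", "buy", "sample", "buy")

indx <- which(response != "buy")   # 1 2 4 6 7 9
i <- length(indx)                  # position of the last element
indx[i + 1]                        # NA: indexing past the end of a vector

# seq() cannot build a sequence that ends at NA, so this call fails
res <- try(seq(indx[i] + 1, indx[i + 1] - 1), silent = TRUE)
inherits(res, "try-error")         # TRUE
```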





On Thu, Jul 25, 2013 at 10:52 PM, arun <smartpink111 at yahoo.com> wrote:

Hi,
>You may try this (didn't get time to test this extensively)
>indx<-which(tt$response!="buy")
>tt$newcolumn<-0
> tt[unlist(lapply(seq_along(indx),function(i) {x1<-if(indx[i]==nrow(tt)) indx[i] else seq(indx[i]+1,indx[i+1]-1);x2<-rbind(tt[indx[1:i],],tt[x1,]); if(any(x2$response=="sample")) row.names(x2[duplicated(x2$product),])})),"newcolumn"]<-1
> tt
>#   subj response product newcolumn
>#1     1   sample       1         0
>#2     1   sample       2         0
>#3     1      buy       3         0
>#4     2   sample       2         0
>#5     2      buy       2         0
>#6     3   sample       3         1
>#7     3   sample       2         1
>#8     3      buy       1         0
>#9     4   sample       1         1
>#10    4      buy       4         0
>A.K.
>
>
>
>
>
>
>
>
>________________________________
>From: vanessa van der vaart <vanessa.vaart at gmail.com>
>To: smartpink111 at yahoo.com
>Sent: Thursday, July 25, 2013 3:49 PM
>Subject: Re: duplicated() with conditional statement
>
>
>
>
>Thank you for the reply.
>It is based on the entire data set.
>
>   subj response product newcolumn
>1     1   sample       1         0
>2     1   sample       2         0
>3     1      buy       3         0
>4     2   sample       2         0
>5     2      buy       2         0
>6     3   sample       3         1
>7     3   sample       2         1
>8     3      buy       1         0
>9     4   sample       1         1
>10    4      buy       4         0
>
>I am sorry I didn't ask the question very clearly. Let me restate the conditional statement; I hope you can understand. I will explain by example.
>
>As you can see, almost every product value is duplicated, but only in rows 6, 7, and 9 is the value in the new column 1.
>
>In row 4, the value is duplicated (2 already occurred in row 2), but a value is considered duplicated only if it repeats a value from a row where the response is 'buy', so the new column in row 4 is still zero.
>
>In row 6, the value in the product column is 3. The value 3 already occurred in row 3, where the response is 'buy', so the new column should be 1.
>
>I hope the conditional statement is understandable now.
>
>
>
>
>
>On Thu, Jul 25, 2013 at 8:25 PM, <smartpink111 at yahoo.com> wrote:
>
>Hi,
>>Maybe I have misunderstood.
>>Your new column value doesn't correspond to your conditional statement.  Also, is this duplication based on the entire dataset or within "subj"?
>><quote author='misseb'>
>>Hi everybody,
>>I have a question about the R function duplicated(). I have spent days trying to
>>figure this out, but I can't find a solution yet. I hope somebody can help
>>me.
>>This is my data:
>>
>>subj=c(1,1,1,2,2,3,3,3,4,4)
>>response=c('sample','sample','buy','sample','buy','sample','sample','buy','sample','buy')
>>product=c(1,2,3,2,2,3,2,1,1,4)
>>tt=data.frame(subj, response, product)
>>
>>the data look like this:
>>
>>   subj response product
>>1     1   sample       1
>>2     1   sample       2
>>3     1      buy       3
>>4     2   sample       2
>>5     2      buy       2
>>6     3   sample       3
>>7     3   sample       2
>>8     3      buy       1
>>9     4   sample       1
>>10    4      buy       4
>>
>>
>>
>>I want to create a new column based on the values in the response and product
>>columns. If the value in product is duplicated, then the value in the new column
>>is 1, otherwise it is 0.
>>But I want to add a condition: the value in the product column
>>will only be considered duplicated if the value in the response column is
>>'buy'.
>>For illustration, the table should look like this:
>>
>>   subj response product newcolumn
>>1     1   sample       1         0
>>2     1   sample       2         0
>>3     1      buy       3         0
>>4     2   sample       2         0
>>5     2      buy       2         0
>>6     3   sample       3         1
>>7     3   sample       2         1
>>8     3      buy       1         0
>>9     4   sample       1         1
>>10    4      buy       4         0
>>
>>
>>Can somebody help me?
>>Any help will be appreciated.
>>I am new to this mailing list, so forgive me in advance if I did not ask
>>the question appropriately.
>>
>>
>>
>>
>>
>></quote>
>>Quoted from:
>>http://r.789695.n4.nabble.com/duplicated-with-conditional-statement-tp4672342.html
>>
>>
>>_____________________________________
>>Sent from http://r.789695.n4.nabble.com
>>
>>
>


