[R] please check this

arun smartpink111 at yahoo.com
Mon Jun 10 16:39:26 CEST 2013


Hi,
Try this:
which(duplicated(res10Percent))
# [1] 117 125 157 189 213 235 267 275 278 293 301 327 331 335 339 367 369 371 379
#[20] 413 415 417 441 459 461 477 479 505
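res10PercentSub1<-subset(res10Percent[which(duplicated(res10Percent)),],dummy==1)  #most of the duplicated rows are dummy==1
res10PercentSub0<-subset(res10Percent[which(duplicated(res10Percent)),],dummy==0)
#a duplicated dummy==1 row is dropped together with the row after it (assumed to be its dummy==0 match)
indx1<-as.numeric(row.names(res10PercentSub1))
indx11<-sort(c(indx1,indx1+1))
#a duplicated dummy==0 row is dropped together with the row before it (assumed to be its dummy==1 match)
indx0<-as.numeric(row.names(res10PercentSub0))
indx00<-sort(c(indx0,indx0-1))
indx10<-sort(c(indx11,indx00))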

nrow(res10Percent[-indx10,])
#[1] 452
res10PercentNew<-res10Percent[-indx10,]
nrow(subset(res10PercentNew,dummy==1))
#[1] 226
nrow(subset(res10PercentNew,dummy==0))
#[1] 226
nrow(unique(res10PercentNew))
#[1] 452
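
To see what the index arithmetic above does, here is a minimal sketch on a made-up data frame. The `toy` data and its `firm` column are invented for illustration; the sketch assumes, as the code above does, that rows come in consecutive pairs (the dummy==1 row first, its dummy==0 match right after) and that the row names are 1..n.

```r
# Toy data, invented for illustration: three pairs, each a dummy==1 row
# followed by its matched dummy==0 row. The control in the third pair
# (row 6) is identical to the control in the first pair (row 2).
toy <- data.frame(
  firm      = c("A", "X", "B", "Y", "C", "X"),
  dimension = c(2554, 2580, 3000, 3010, 2570, 2580),
  dummy     = c(1, 0, 1, 0, 1, 0),
  stringsAsFactors = FALSE
)

dupRows <- which(duplicated(toy))   # positions of rows that repeat an earlier row
dup     <- toy[dupRows, ]

# A duplicated dummy==1 row takes the row after it along;
# a duplicated dummy==0 row takes the row before it along.
idx1 <- as.numeric(row.names(subset(dup, dummy == 1)))
idx0 <- as.numeric(row.names(subset(dup, dummy == 0)))
drop <- sort(c(idx1, idx1 + 1, idx0, idx0 - 1))

# Remove both members of every pair that contains a duplicated row.
toyNew <- toy[-drop, ]
nrow(toyNew)
nrow(unique(toyNew))
```

Here the pair in rows 5 and 6 is removed and the first pair that used firm X is kept. One caveat: if there are no duplicates at all, `drop` is empty and `toy[-drop, ]` returns zero rows instead of the full data, so in that case the subsetting step should be skipped.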
A.K.



----- Original Message -----
From: Cecilia Carmo <cecilia.carmo at ua.pt>
To: arun <smartpink111 at yahoo.com>
Cc: 
Sent: Monday, June 10, 2013 10:19 AM
Subject: RE: please check this

But I don't want it like this. 
Once a firm is paired with another, these two firms should not be paired again.
Could you solve this?
Thanks,
Cecília


________________________________________
From: arun [smartpink111 at yahoo.com]
Sent: Monday, June 10, 2013 15:12
To: Cecilia Carmo
Subject: Re: please check this

I did look into that.
If you check nrow() within each category, the counts will be different.  It means that the duplicates are not duplicated pairs; they only show up when you look at the whole `result`.  The explanation is again the multiple matches.  Here we selected the dummy==0 row whose dimension most closely matches that of a dummy==1 row.  Suppose the dimension of a row with dummy==1 is `2554` and it got matched with a dummy==0 row with `2580`.  Now, consider another row with dummy==1 and dimension `2570` (which also falls within the same split group).  It also got matched with the dummy==0 row with `2580`.  I guess it comes down to the way in which the matching was tested.
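
The situation described above can be reproduced in a few lines. This is only a sketch of nearest-value matching in general; the code of `fun1` is not shown in this thread, so the vectors `treated` and `controls` and the second control value `2900` are assumptions made up for illustration.

```r
# Hypothetical dimensions, reusing the numbers from the explanation above.
treated  <- c(2554, 2570)   # dummy==1 rows in one split group
controls <- c(2580, 2900)   # dummy==0 candidates in the same group (2900 is invented)

# For each dummy==1 value, pick the dummy==0 value with the smallest
# absolute difference. Nothing here stops two dummy==1 rows from
# picking the same dummy==0 row.
nearest <- sapply(treated, function(d) controls[which.min(abs(controls - d))])
nearest

# The second match repeats the first, so the matched dummy==0 row
# appears twice in the combined result.
duplicated(nearest)
```

Both dummy==1 values end up with `2580` as their closest match, which is how the same dummy==0 row can appear more than once even though each dummy==1 row appears only once.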






________________________________
From: Cecilia Carmo <cecilia.carmo at ua.pt>
To: arun <smartpink111 at yahoo.com>
Sent: Monday, June 10, 2013 10:02 AM
Subject: please check this




When I do

res10Percent<- fun1(final3New,0.1,200)
dim(res10Percent)
#[1] 508   5
nrow(subset(res10Percent,dummy==0))
#[1] 254
nrow(subset(res10Percent,dummy==1))
#[1] 254


testingDuplicates<-unique(res10Percent)
nrow(testingDuplicates)
#[1] 480   # this should be 508; if it is not, there are duplicated rows, aren't there?


Thanks
Cecilia


