[R] Duplicate rows when I combine two data.frames with merge!

RKinzer ryank at nezperce.org
Mon Feb 6 21:29:53 CET 2012


Hello all,

First, I have searched extensively on this forum and others, and nothing I
found seems to work.  So I decided to post, hoping someone could point me to
the right post or give me some help.

I have drawn 100 samples from a fictitious population (N=1000), and then
randomly selected 25% of the 100 samples.  I would now like to merge the
data.frame from the 100 samples with the data.frame for the 25 individuals
from the sample.  When I do this with the following code I get duplicate
rows, when I should have at most 100.

#sets up 1000 random numbers for each of ages 3,4,5
x<-mapply(rnorm,1000,c(54,78,89),c(3.5,5.5,5.9))
x.3<-sample(x[,1],60)  #randomly selects 60 lengths from age 3
x.4<-sample(x[,2],740)
x.5<-sample(x[,3],200)
length<-c(x.3,x.4,x.5)  
length<-round(length,digits=0)  #rounds lengths to whole number
age3<-rep(3,60) 
age4<-rep(4,740)
age5<-rep(5,200)
age<-c(age3,age4,age5)  #combines ages into one vector
unique<-1:1000  #gives each fish a unique id
pop<-data.frame(unique,length,age) 
#randomizes the order of pop
pop<-pop[sample(1:1000,size=1000,replace=FALSE),]
c.one<-pop[sample(1:1000,size=100,replace=TRUE),] 
a.one.qtr<-c.one[sample(1:100,size=25,replace=TRUE),] 
merge<-merge(c.one,a.one.qtr,by="unique",all=TRUE)
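
One likely source of the extra rows, sketched here on toy data rather than the data above: both sample() calls for the 100 and the 25 use replace=TRUE, so the same fish can be drawn more than once, and merge() returns every pairing of rows that share a key. The data frames `left` and `right` and their columns are hypothetical and only illustrate that behaviour.

```r
# Toy data (hypothetical): id 2 appears twice on each side
left  <- data.frame(id = c(1, 2, 2), x = c(10, 20, 20))
right <- data.frame(id = c(2, 2),    y = c("a", "a"))

# anyDuplicated() returns 0 only when the key has no repeats
anyDuplicated(left$id) > 0    # TRUE: the key is not unique

# Each of the 2 left rows with id 2 pairs with each of the 2 right rows,
# and all = TRUE keeps the unmatched row for id 1
both <- merge(left, right, by = "id", all = TRUE)
nrow(both)                    # 5 rows from only 3 + 2 input rows
```

If this is what is happening, checking anyDuplicated(c.one$unique) before merging should show it.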

What I would ultimately like to have is one row for each of the 100 fish in
the sample and three columns (unique, length, age), plus some way to identify
the 25 selected individuals.
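
A minimal sketch of one way to get that layout, assuming the 100 fish and the 25 fish are each meant to be distinct individuals (sampling without replacement). Instead of merging, it adds a logical flag to the 100-row data.frame. The stand-in `pop`, the seed, and the names `aged.ids` and `aged` are made up for illustration and are not from the code above.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Stand-in population with the same three columns (values are made up)
pop <- data.frame(unique = 1:1000,
                  length = round(rnorm(1000, mean = 78, sd = 5.5)),
                  age    = sample(3:5, 1000, replace = TRUE))

# Draw 100 distinct fish by sampling row indices without replacement
c.one <- pop[sample(nrow(pop), size = 100, replace = FALSE), ]

# Draw the ids of 25 distinct fish from those 100
aged.ids <- sample(c.one$unique, size = 25, replace = FALSE)

# Flag the subsample; 'aged' is a hypothetical column name
c.one$aged <- c.one$unique %in% aged.ids

nrow(c.one)      # 100
sum(c.one$aged)  # 25
```

This keeps unique, length and age as they are and adds one extra column marking the 25 selected rows, so no merge on a key is needed.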

Thank you in advance for any help.  I have been stuck for days.

Ryan



--
View this message in context: http://r.789695.n4.nabble.com/Duplicate-rows-when-I-combine-two-data-frames-with-merge-tp4362685p4362685.html
Sent from the R help mailing list archive at Nabble.com.


