[R] Problem merging data frames and duplicates

Frank Burbrink burbrink666 at gmail.com
Wed May 27 02:12:55 CEST 2015


Hello All,

I am attempting to merge two data frames that naturally contain duplicate
entries, however when using either merge or dMerge I get even more
duplicates.

For example:

data.frame(state=c("IL", "IL", "LA","LA", "MS","MS", "AR", "AR"),
locus=c(1,1,2,2,3,4,5,6))->x

data.frame(state=c("IL", "IL", "AR", "AR", "TN","TN"),
locus=c(1,1,2,3,3,4,4))->y

These yield:

 x
  state locus
1    IL     1
2    IL     1
3    LA     2
4    LA     2
5    MS     3
6    MS     4
7    AR     5
8    AR     6

y
  state locus
1    IL     1
2    IL     1
3    AR     2
4    AR     3
5    TN     3
6    TN     4

However, when merged by "state: they produce another second set of AR and
IL:
merge(x,y,by=1,all=T)->z

   state locus.x locus.y
1     AR       5       2
2     AR       5       3
3     AR       6       2
4     AR       6       3
5     IL       1       1
6     IL       1       1
7     IL       1       1
8     IL       1       1
9     LA       2      NA
10    LA       2      NA
11    MS       3      NA
12    MS       4      NA
13    TN      NA       3
14    TN      NA       4

While, the NAs are desired when states are missing, I don't want the extra
duplicated states and values but rather:

   state locus.x locus.y
1     AR       5       2
4     AR       6       3
7     IL       1       1
8     IL       1       1
9     LA       2      NA
10    LA       2      NA
11    MS       3      NA
12    MS       4      NA
13    TN      NA       3
14    TN      NA       4

Any help would be much appreciated.

Thanks!

Frank

-- 

*************************************
*Frank T. Burbrink, Ph.D.*
*Professor*
*Biology Department*
*6S-143*
*2800 Victory Blvd.*
*College of Staten Island/CUNY*
*Staten Island, New York 10314*
*E-Mail:Frank.Burbrink at csi.cuny.edu <E-Mail%3AFrank.Burbrink at csi.cuny.edu>*
*Phone:718-982-3961*
*Web Page: http://scholar.library.csi.cuny.edu/~fburbrink/
<http://scholar.library.csi.cuny.edu/%7Efburbrink/>*
*************************************
*Chair *
*Ecology, Evolutionary Biology, and Behavior*
*Doctoral Subprogram*
*Biology Program*
*City University of New York *
*Graduate Center*
*365 Fifth Avenue*
*New York, NY 10016-4309*
************************************

	[[alternative HTML version deleted]]



More information about the R-help mailing list