[R] Merging files function

Gavin Simpson gavin.simpson at ucl.ac.uk
Thu Jun 1 16:39:18 CEST 2006


On Thu, 2006-06-01 at 09:16 -0400, Chuck Cleland wrote:
> Ahamarshan jn wrote:
> > hi list,
> > 
> > This question must be very basic, but I am just 3 days
> > old to R, so I think I can ask. I am trying to find a
> > function to merge two
> > tables of data in two different files as one.
> > 
> >  Does the merge function only fill in columns between two
> > tables where data is missing, or is there a way that
> > merge can be used to merge data between two matrices
> > with common dimensions?
<snip />

Hi Ahamarshan,

I asked a similar question recently. Chuck's email provides a solution,
to which I'll add a comment and a link to the discussions I had with
Marc Schwartz and Sundar Dorai-Raj.

If your real-world use is more complicated than your example, then
you'll need a slightly different strategy. Suppose you have data frames
with different numbers of rows, such as:

# alter Chuck's example to have one df with 5 rows the other with 4
df1 <- as.data.frame(matrix(rnorm(20), ncol=4))
df2 <- as.data.frame(matrix(rnorm(20), ncol=5))
names(df1) <- paste("v", 1:4, sep="")
names(df2) <- paste("x", 1:5, sep="")
row.names(df1) <- paste("h", 1:5, sep="")
row.names(df2) <- paste("h", 1:4, sep="")

Now if you merge these, merge() gives you a result with only 4 rows,
because by default it keeps only the rows whose names appear in both
data frames.

merge(df1, df2, by="row.names")
## lost a row, now with all rows:
merge(df1, df2, by="row.names", all = TRUE)

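So use all = TRUE if the two data frames do not share the same set of
rows; rows with no match in the other data frame are filled with NA.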
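
To make the difference concrete, here is a minimal sketch that rebuilds
the two data frames and compares the row counts from each kind of merge.
The set.seed() call and the object names inner, full and left are my own
additions for illustration; all.x is a standard merge() argument that
keeps every row of the first data frame only.

```r
# Same shapes as in the example: df1 has 5 rows, df2 has 4
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
df1 <- as.data.frame(matrix(rnorm(20), ncol = 4))
df2 <- as.data.frame(matrix(rnorm(20), ncol = 5))
names(df1) <- paste("v", 1:4, sep = "")
names(df2) <- paste("x", 1:5, sep = "")
row.names(df1) <- paste("h", 1:5, sep = "")
row.names(df2) <- paste("h", 1:4, sep = "")

inner <- merge(df1, df2, by = "row.names")                # rows in both
full  <- merge(df1, df2, by = "row.names", all = TRUE)    # rows in either
left  <- merge(df1, df2, by = "row.names", all.x = TRUE)  # every row of df1

nrow(inner)  # 4
nrow(full)   # 5
nrow(left)   # 5

# h5 exists only in df1, so its x1..x5 columns are filled with NA
full[full$Row.names == "h5", ]

# merge() returns the key as a column called "Row.names"; move it back
# into the row names if you want the original layout
row.names(full) <- as.character(full$Row.names)
full$Row.names <- NULL
```

Here all = TRUE and all.x = TRUE give the same 5 rows because every row
of df2 also appears in df1; they would differ if df2 had rows that df1
lacks.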

For more complicated merges, you might check out the replies I got from
Marc and Sundar in the following thread:

http://thread.gmane.org/gmane.comp.lang.r.general/63031/focus=63042

Finally, why did you post your message to the list twice, with different
subject lines?

HTH,

G

> 
> See ?merge.
> 
> df1 <- as.data.frame(matrix(rnorm(20), ncol=4))
> df2 <- as.data.frame(matrix(rnorm(20), ncol=4))
> names(df1) <- paste("v", 1:4, sep="")
> names(df2) <- paste("x", 1:4, sep="")
> row.names(df1) <- paste("h", 1:5, sep="")
> row.names(df2) <- paste("h", 1:5, sep="")
> 
> newdf <- merge(df1, df2, by="row.names")

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
*  Note new Address, Telephone & Fax numbers from 6th April 2006  *
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson
ECRC & ENSIS                  [t] +44 (0)20 7679 0522
UCL Department of Geography   [f] +44 (0)20 7679 0565
Pearson Building              [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street                  [w] http://www.ucl.ac.uk/~ucfagls/cv/
London, UK.                   [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


