[R] data frames; matching/merging

David Winsemius dwinsemius at comcast.net
Mon Feb 8 17:51:45 CET 2010


On Feb 8, 2010, at 11:39 AM, Jonathan wrote:

> Hi all,
>    I'm feeling a little guilty to ask this question, since I've
> written a solution using a rather clunky for loop that gets the job
> done.  But I'm convinced there must be a faster (and probably more
> elegant) way to accomplish what I'm looking to do (perhaps using the
> "merge" function?).  I figured somebody out there might've already
> figured this out:
>
> I have a dataframe with two columns (let's call them V1 and V2).  All
> rows are unique, although column V1 has several redundant entries.
>
> Ex:
>
>     V1     V2
> 1    a        3
> 2    a        2
> 3    b        9
> 4    c        4
> 5    a        7
> 6    b        11
>
>
> What I'd like is to return a dataframe cut down to have only unique
> entries in V1.  For each V1, V2 should contain the minimum of all the
> V2 values found in the rows that share that V1.

 > rd.txt
function(txt, header=TRUE, ...) {
       # read a data frame from a character string
       con <- textConnection(txt)
       on.exit(close(con))   # close only this connection, not all open ones
       read.table(con, header=header, ...)
}
 > DF <- rd.txt("    V1     V2
+ 1    a        3
+ 2    a        2
+ 3    b        9
+ 4    c        4
+ 5    a        7
+ 6    b        11
+ ")
 > tapply(DF$V2, DF$V1, min)
a b c
2 9 4

 > as.data.frame.table(tapply(DF$V2, DF$V1, min))
   Var1 Freq
1    a    2
2    b    9
3    c    4
 > DF2 <- as.data.frame.table(tapply(DF$V2, DF$V1, min))
 > names(DF2) <- names(DF)
 > DF2
   V1 V2
1  a  2
2  b  9
3  c  4

>
> Example output:
>
>      V1     V2
> 1     a        2
> 2     b        9
> 3     c        4
>
>
> If somebody could (relatively easily) figure out how to get closer to
> a solution, I'd appreciate hearing how.  Also, I'd be interested to
> hear how you came upon the answer (so I can get better at searching
> the R resources myself).
>
> Regards,
> Jonathan
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


