[R] using match to obtain non-sorted index values from non-sorted vector

Folkes, Michael Michael.Folkes at dfo-mpo.gc.ca
Wed Jul 9 22:13:20 CEST 2014

So nice! 
Apply wins again.
Thanks David.

-----Original Message-----
From: David L Carlson [mailto:dcarlson at tamu.edu] 
Sent: July-09-14 1:11 PM
To: Folkes, Michael; r-help at r-project.org
Subject: RE: using match to obtain non-sorted index values from
non-sorted vector

There may be a faster way, but 

> sapply(Tset, function(x) which(pop.df$pop==x))
[1] 5 4 2
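
One candidate for a faster way, offered as a sketch rather than something timed on the real data, is to call match() with the arguments swapped: the subset goes in as x and the population as table. match() then returns one position per element of Tset, in Tset's own order. The names idx_sapply and idx_match are made up for this illustration.

```r
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
Tset <- c(10, 3, 6)

# the sapply/which approach shown above
idx_sapply <- sapply(Tset, function(x) which(pop.df$pop == x))

# match(x, table): for each element of x, the position of its first match in table
idx_match <- match(Tset, pop.df$pop)

idx_match
# [1] 5 4 2
```

The two approaches differ at the edges. If a value in Tset is absent from pop, match() gives NA for it, while which() gives integer(0) and sapply() then returns a list. If pop holds duplicates, match() reports only the first position.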

David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Folkes, Michael
Sent: Wednesday, July 9, 2014 2:58 PM
To: r-help at r-project.org
Subject: [R] using match to obtain non-sorted index values from
non-sorted vector

Hello all,

I've been struggling with the best way to find index values from a large
vector with elements that will match elements of a subset vector [the
table argument in match()]. 

BUT the index values can't come out sorted (as we'd get from
which(X %in% Y)).

My 'population' vector can't be sorted. 

pop.df <- data.frame(pop=c(1,6,4,3,10)) 

The subset:  Tset = c(10,3,6)

So I'd like to get these index values (from pop.df), in this order: 5, 4, 2

If it could be sorted I could use:

which(sort(pop.df$pop) %in% sort(Tset))

But sorting will cause more grief later, so best not mess with it.
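
To make the ordering problem concrete, here is a small sketch on the same data. It shows that which() combined with %in% reports positions in ascending order even when nothing is sorted explicitly. The name hits is introduced only for this illustration.

```r
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
Tset <- c(10, 3, 6)

# %in% yields one TRUE/FALSE per element of pop, so which() walks pop left to right
hits <- pop.df$pop %in% Tset
which(hits)
# [1] 2 4 5

# the values at those positions come back in pop's order, not Tset's
pop.df$pop[which(hits)]
# [1]  6  3 10
```

The positions are correct as a set, but the link between each position and its place in Tset (10, 3, 6) is lost.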

Here is my hopefully adequate MWE of a solution. I'm keen to see if
anybody has a better suggestion. 




# pop.df is the full set of values; it has no info on their ranking.
# I don't want to sort these data. They need to remain in this order.
pop.df <- data.frame(pop=c(1,6,4,3,10))

#rank.df is my dataframe that tells me the top three rankings (derived
rank.df <- data.frame(rank=1:3, Tset = c(10,3,6))   # Target set

# match.df will be my source of row index based on rank.
# NOTE: the second column was cut off in the archived message; a row index
# named 'index' is assumed here so the merge below has something to carry over.
match.df <- data.frame(match.vec = match(pop.df$pop, table=rank.df$Tset),
                       index = seq_len(nrow(pop.df)))

# rank.df will now include the index location in pop.df where I can
# find the top three ranks.
rank.df <- merge(rank.df, match.df, by.x='rank', by.y='match.vec')
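
The same index column could probably be obtained without the intermediate data frame and merge(), by matching the ranked values against the population directly. This is a minimal sketch on the toy data from this message; the column name index is an assumption chosen for the example.

```r
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
rank.df <- data.frame(rank = 1:3, Tset = c(10, 3, 6))

# one lookup per ranked value; pop.df is never reordered
rank.df$index <- match(rank.df$Tset, pop.df$pop)

rank.df
#   rank Tset index
# 1    1   10     5
# 2    2    3     4
# 3    3    6     2

# pull the ranked rows back out of pop.df, in rank order
pop.df[rank.df$index, , drop = FALSE]
```

Because match() keeps the order of its first argument, the rows of rank.df stay in rank order and no re-sorting is needed after the lookup.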




Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans
Pacific Biological Station
3190 Hammond Bay Rd.
Nanaimo, B.C., Canada

Ph (250) 756-7264 Fax (250) 756-7053  Michael.Folkes at dfo-mpo.gc.ca
