[R] using match to obtain non-sorted index values from non-sorted vector

Folkes, Michael Michael.Folkes at dfo-mpo.gc.ca
Wed Jul 9 22:13:20 CEST 2014


So nice! 
Apply wins again.
Thanks David.
Michael

-----Original Message-----
From: David L Carlson [mailto:dcarlson at tamu.edu] 
Sent: July-09-14 1:11 PM
To: Folkes, Michael; r-help at r-project.org
Subject: RE: using match to obtain non-sorted index values from
non-sorted vector

There may be a faster way, but 

> sapply(Tset, function(x) which(pop.df$pop==x))
[1] 5 4 2
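
On the "faster way": one candidate, offered as a sketch rather than a benchmarked claim, is to call match() with the arguments the other way round from the original post, so that Tset is looked up in pop.df$pop. The variable names idx and idx_sapply are made up for this illustration.

```r
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
Tset   <- c(10, 3, 6)

# match(x, table) returns, for each element of x, the position of its
# first occurrence in table, in the order of x.
idx <- match(Tset, pop.df$pop)
idx   # 5 4 2

# Same positions as the sapply()/which() version, as long as every value
# of Tset occurs exactly once in pop.df$pop.
idx_sapply <- sapply(Tset, function(x) which(pop.df$pop == x))
all(idx == idx_sapply)
```

The two differ at the edges: match() gives NA for a value that is absent and only the first position for a value that is repeated, while the sapply() version returns a list in either of those cases.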

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Folkes, Michael
Sent: Wednesday, July 9, 2014 2:58 PM
To: r-help at r-project.org
Subject: [R] using match to obtain non-sorted index values from
non-sorted vector

Hello all,

I've been struggling with the best way to find index values from a large
vector with elements that will match elements of a subset vector [the
table argument in match()]. 

BUT the index values can't come out sorted (as we'd get from
which(X %in% Y)).

My 'population' vector can't be sorted. 

pop.df <- data.frame(pop=c(1,6,4,3,10)) 

The subset:  Tset = c(10,3,6)



So I'd like to get these index values (from pop.df), in this order:
5,4,2
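
To make the difficulty concrete, here is a small sketch of what which() with %in% returns for the data above. The name hits is introduced only for this illustration.

```r
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
Tset   <- c(10, 3, 6)

# %in% produces one TRUE/FALSE per element of pop.df$pop, so which()
# reports positions in ascending order regardless of the order of Tset.
hits <- pop.df$pop %in% Tset
which(hits)   # 2 4 5, not the wanted 5 4 2
```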



If it could be sorted I could use:

which(sort(pop.df$pop) %in% sort(Tset))



But sorting will cause more grief later, so best not mess with it.

Here is my hopefully adequate MWE of a solution. I'm keen to see if
anybody has a better suggestion. 

Thanks!

_____________________

###BEGIN R

# pop.df holds the full set of values; it has no info on their ranking.
# I don't want to sort these data. They need to remain in this order.
pop.df <- data.frame(pop=c(1,6,4,3,10))

# rank.df is my data frame that tells me the top three rankings
# (derived elsewhere)
rank.df <- data.frame(rank=1:3, Tset = c(10,3,6))   # Target set

# match.df will be my source of row index based on rank
match.df <- data.frame(match.vec= match(pop.df$pop, table=rank.df$Tset),
                       index.vec=1:nrow(pop.df))

# rank.df will now include the index location in pop.df where I can
# find the top three ranks.
rank.df  <- merge(rank.df, match.df, by.x='rank', by.y='match.vec')

rank.df



####END
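
For anyone tracing the MWE above, the following sketch prints the intermediate values step by step. It reuses the data from the post; match.vec as a standalone vector and the name merged are introduced here only for illustration.

```r
pop.df  <- data.frame(pop = c(1, 6, 4, 3, 10))
rank.df <- data.frame(rank = 1:3, Tset = c(10, 3, 6))

# For each row of pop.df: its rank if the value is in Tset, otherwise NA.
match.vec <- match(pop.df$pop, table = rank.df$Tset)
match.vec   # NA 3 NA 2 1

match.df <- data.frame(match.vec = match.vec,
                       index.vec = seq_len(nrow(pop.df)))

# merge() keeps only keys present on both sides, so the NA rows drop out,
# and the result is ordered by the key column 'rank'.
merged <- merge(rank.df, match.df, by.x = "rank", by.y = "match.vec")
merged$index.vec   # 5 4 2
```

The merge step is what turns "rank of each population row" into "population row of each rank".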



_______________________________________________________

Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans
Pacific Biological Station
3190 Hammond Bay Rd.
Nanaimo, B.C., Canada
V9T-6N7
Ph (250) 756-7264 Fax (250) 756-7053  Michael.Folkes at dfo-mpo.gc.ca





______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


