[R] using match to obtain non-sorted index values from non-sorted vector

Folkes, Michael Michael.Folkes at dfo-mpo.gc.ca
Thu Jul 10 00:04:14 CEST 2014


Oh dear,
I seem to have suffered a case of reversed arguments. 
This explains my surprise that R didn't already have a function for
this: it does!
I was following the pattern of which(search.vector %in% pattern), but to
get the indices in the order of the pattern, match() needs its arguments
the other way around: match(pattern, search.vector).
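
For anyone finding this thread later, here is a minimal sketch of the
two argument orders, using the example data from the original post. The
names idx_sorted, pos_in_tset and idx_ranked are only illustrative.

```r
pop  <- c(1, 6, 4, 3, 10)   # population vector, order must be preserved
Tset <- c(10, 3, 6)         # target subset, in rank order

# x %in% table is defined as match(x, table, nomatch = 0L) > 0L,
# so which() reports positions in pop, always in increasing order.
idx_sorted <- which(pop %in% Tset)

# Reversed roles: for each pop value, its position in Tset (NA if absent).
pos_in_tset <- match(pop, Tset)

# Intended call: for each Tset value, its position in pop, in Tset order.
idx_ranked <- match(Tset, pop)

print(idx_sorted)   # 2 4 5
print(pos_in_tset)  # NA 3 NA 2 1
print(idx_ranked)   # 5 4 2
```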

Thanks to both Davids.
Michael

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: July-09-14 2:01 PM
To: Folkes, Michael
Cc: David L Carlson; r-help at r-project.org
Subject: Re: [R] using match to obtain non-sorted index values from
non-sorted vector


On Jul 9, 2014, at 1:13 PM, Folkes, Michael wrote:

> So nice! 
> Apply wins again.

I doubt that `sapply( ..., which(,) )` would win a foot race with
`match`:

> match(Tset, pop.df$pop)
[1] 5 4 2
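
A rough way to check both points (same answer, different speed) is
sketched below. The larger vectors big_pop and big_set are made-up test
data, not from this thread, and timings vary by machine, so none are
quoted here.

```r
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
Tset   <- c(10, 3, 6)

# Both approaches agree on the example from the thread.
by_which <- sapply(Tset, function(x) which(pop.df$pop == x))
by_match <- match(Tset, pop.df$pop)
stopifnot(identical(by_which, by_match))

# Hypothetical larger test data: a shuffled population of unique values.
set.seed(1)
big_pop <- sample(1e5)
big_set <- sample(big_pop, 2000)

# sapply/which rescans big_pop once per element of big_set;
# match does the whole lookup in a single vectorised call.
t_which <- system.time(r1 <- sapply(big_set, function(x) which(big_pop == x)))
t_match <- system.time(r2 <- match(big_set, big_pop))
stopifnot(identical(r1, r2))

print(c(sapply = t_which[["elapsed"]], match = t_match[["elapsed"]]))
```

The two are only interchangeable when every target value occurs exactly
once in the population. If a value is missing or duplicated, match()
returns NA or the first position, whereas the sapply/which version
returns a list with elements of differing length.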

--
David.
> Thanks David.
> Michael
> 
> -----Original Message-----
> From: David L Carlson [mailto:dcarlson at tamu.edu]
> Sent: July-09-14 1:11 PM
> To: Folkes, Michael; r-help at r-project.org
> Subject: RE: using match to obtain non-sorted index values from 
> non-sorted vector
> 
> There may be a faster way, but
> 
>> sapply(Tset, function(x) which(pop.df$pop==x))
> [1] 5 4 2
> 
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
> 
> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org]
> On Behalf Of Folkes, Michael
> Sent: Wednesday, July 9, 2014 2:58 PM
> To: r-help at r-project.org
> Subject: [R] using match to obtain non-sorted index values from 
> non-sorted vector
> 
> Hello all,
> 
> I've been struggling with the best way to find index values from a 
> large vector with elements that will match elements of a subset vector
> [the table argument in match()].
> 
> BUT the index values can't come out sorted (as we'd get in
> which(X %in% Y) ).
> 
> My 'population' vector can't be sorted. 
> 
> pop.df <- data.frame(pop=c(1,6,4,3,10))
> 
> The subset:  Tset = c(10,3,6)
> 
> 
> 
> So I'd like to get these index values (from pop.df) , in this order:
> 5,4,2
> 
> 
> 
> If it could be sorted I could use:
> 
> which(sort(pop.df$pop) %in% sort(Tset))
> 
> 
> 
> But sorting will cause more grief later, so best not mess with it.
> 
> Here is my hopefully adequate MWE of a solution. I'm keen to see if 
> anybody has a better suggestion.
> 
> Thanks!
> 
> _____________________
> 
> ###BEGIN R
> 
> #pop is the full set of values, it has no info on their ranking
> 
> # I don't want to sort these data. They need to remain in this order.
> 
> pop.df <- data.frame(pop=c(1,6,4,3,10))
> 
> 
> 
> #rank.df is my dataframe that tells me the top three rankings
> #(derived elsewhere)
> 
> rank.df <- data.frame(rank=1:3, Tset = c(10,3,6))   # Target set
> 
> 
> 
> #match.df will be my source of row index based on rank
> 
> match.df <- data.frame(match.vec= match(pop.df$pop, 
> table=rank.df$Tset),
> index.vec=1:nrow(pop.df))
> 
> 
> #rank.df will now include the index location in the pop.df where I can
> #find the top three ranks.
> 
> rank.df  <- merge(rank.df, match.df, by.x='rank', by.y='match.vec')
> 
> rank.df
> 
> 
> ####END
> 
> 
> 
> _______________________________________________________
> 
> Michael Folkes
> 
> Salmon Stock Assessment
> 


David Winsemius
Alameda, CA, USA


