[R] inclusion criteria help

Thomas Lumley tlumley at u.washington.edu
Tue Nov 27 18:10:11 CET 2001


On Tue, 27 Nov 2001, Aaron J Mackey wrote:

>
> On Tue, 27 Nov 2001, Thomas Lumley wrote:
>
> > so for many queries you could tapply() or by() this process
> >
> >   filter<-function(this.subset){
> > 	this.subset[this.subset$coverage==max(this.subset$coverage),]
> > 	}
> >   filtered<-by(hits, hits$query,filter)
>
> Thank you, this was the bit of R programming style that I hadn't yet
> learned.  A small followup question regarding the "selection" filter: what
> if there are ties in (maximal) coverage that need to be broken by
> additional logic ?
>
> so that:
>
> filter <- function(subset) { subset[which.max(subset$coverage),] }
>
> becomes:
>
> sort.func <- function(subset) {
>   # sort by max(coverage), break ties with additional logic ??
> }
> filter <- function(subset) { subset[sort.func(subset),] }
>

Actually, which.max already does break ties -- it finds the first maximum.
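
As a quick illustration (the vector below is made up for the example, not
taken from the hits data), compare which.max with an explicit comparison
against the maximum:

```r
# Made-up coverage values with a tie for the maximum at positions 2 and 3
coverage <- c(0.7, 0.9, 0.9, 0.4)

# which.max returns only the index of the first maximum
first.best <- which.max(coverage)

# comparing against max() returns the indices of every tied maximum
all.best <- which(coverage == max(coverage))
```

Here first.best is 2, while all.best holds both 2 and 3.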

If you weren't using which.max you could just return the first best row
with
   filter<-function(this.subset){
      this.subset<-this.subset[this.subset$coverage==max(this.subset$coverage),,drop=FALSE]
      this.subset[1,]
     }
or
  filter<-function(this.subset){
      best<-which(this.subset$coverage==max(this.subset$coverage))[1]
      this.subset[best,]
  }


The drop=FALSE in the first example makes sure that the subset is still a
data frame however many rows (or columns) it has, allowing matrix-like
indexing in the next line.
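
If "first row" is not the tie-break you want, one possible sketch is to sort
within each subset using order(), which accepts several keys. The `score`
column and the small `hits` data frame below are hypothetical, invented only
to have something to break ties on; substitute whatever logic applies.

```r
# Hypothetical data: 'score' is an assumed tie-breaking column,
# not something from the original hits data.
hits <- data.frame(
  query    = c("q1", "q1", "q1", "q2", "q2"),
  coverage = c(0.9, 0.9, 0.5, 0.8, 0.6),
  score    = c(10, 25, 40, 7, 9)
)

filter <- function(this.subset) {
  # Order by coverage (descending); ties are broken by score (descending)
  ord <- order(-this.subset$coverage, -this.subset$score)
  # Keep only the top-ranked row, as a data frame
  this.subset[ord[1], , drop = FALSE]
}

# Apply the filter to each query, then stack the one-row results
filtered <- by(hits, hits$query, filter)
best <- do.call(rbind, filtered)
```

For q1 this picks the row with score 25 rather than the first tied row, and
for q2 the row with coverage 0.8.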


	-thomas

