[BioC] genefilter - construct your own test

James W. MacDonald jmacdon at med.umich.edu
Tue Sep 5 15:38:46 CEST 2006


Hi Lina,

Lina Hultin-Rosenberg wrote:
> Dear list!
> 
> I have a question on the package genefilter. Reading the R documentation on
> genefilter I understand that the user can construct his/her own tests but I
> don't really understand how. 
> 
> "This package uses a very simple but powerful protocol for filtering genes.
> The user simply constructs any number of tests that they want to apply. A
> test is simply a function (as constructed using one of the many helper
> functions in this package) that returns TRUE if the gene of interest passes
> the test (or filter) and FALSE if the gene of interest fails." 
> 
> Is it possible to construct your own tests for use in genefilter and how is
> that done? I would like to filter genes on absent/present calls from the
> mas5calls method, perhaps in combination with other filters, so it would be
> very convenient to include that test in genefilter.
> 
> Maybe someone has experience from constructing their own filters and could
> point me in the right direction. I would greatly appreciate some help!

It's actually very simple. An example would be the kOverA() function 
that already exists in genefilter:

 > kOverA
function (k, A = 100, na.rm = TRUE)
{
     function(x) {
         if (na.rm)
             x <- x[!is.na(x)]
         sum(x > A) >= k
     }
}
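
As a quick illustration of how such a closure behaves, here is a minimal 
sketch in base R. The function body is copied from the listing above under 
the hypothetical name kOverA_copy, so it runs without genefilter loaded; 
the vector 'probe' is made-up data.

```r
## Copy of the kOverA() source shown above, under a hypothetical name
## so that the sketch runs without loading genefilter.
kOverA_copy <- function(k, A = 100, na.rm = TRUE) {
    function(x) {
        if (na.rm)
            x <- x[!is.na(x)]
        sum(x > A) >= k
    }
}

## The outer call fixes k and A and returns the actual test function.
atLeast3Over200 <- kOverA_copy(k = 3, A = 200)

## Made-up intensities for one probeset across five samples.
probe <- c(150, 250, 300, NA, 420)

## Three values exceed 200 once the NA is dropped, so the test passes.
passes <- atLeast3Over200(probe)
print(passes)
```

The values of k and A are stored in the environment of the returned 
function, which is why the test itself only needs the data vector 'x'.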

So let's say you want to select probesets based on having a 'present' 
call in n or more samples. You could set up your filter function like this:

mascallsfilter <- function(cutoff = "p", number){
   function(x){
     ## use tolower() to normalize inputs
     sum(tolower(x) == tolower(cutoff)) >= number
   }
}

Note that there are two functions here, one nested in the other. The 
outer function takes the arguments to filter on, and the inner one takes 
an argument 'x', which will be one row of your matrix of calls (i.e., the 
calls for a single probeset), since genefilter() applies the filter to 
each row in turn. You can then use this function like any other in the 
genefilter package:

 > f1 <- mascallsfilter("p", 5)
 > filt <- filterfun(f1)
 > a <- matrix(sample(c("P","M","A"), 100, TRUE), ncol=5)
 > a
       [,1] [,2] [,3] [,4] [,5]
  [1,] "M"  "P"  "A"  "M"  "M"
  [2,] "A"  "P"  "P"  "A"  "A"
  [3,] "M"  "A"  "A"  "P"  "P"
  [4,] "M"  "M"  "M"  "A"  "M"
  [5,] "A"  "P"  "P"  "P"  "P"
  [6,] "M"  "P"  "M"  "M"  "M"
  [7,] "A"  "M"  "P"  "M"  "A"
  [8,] "A"  "P"  "P"  "P"  "A"
  [9,] "A"  "M"  "P"  "M"  "A"
[10,] "P"  "A"  "M"  "A"  "M"
[11,] "P"  "M"  "A"  "A"  "A"
[12,] "M"  "A"  "A"  "M"  "P"
[13,] "P"  "A"  "A"  "A"  "A"
[14,] "M"  "A"  "A"  "A"  "M"
[15,] "A"  "A"  "A"  "A"  "A"
[16,] "A"  "A"  "P"  "M"  "M"
[17,] "M"  "M"  "P"  "A"  "M"
[18,] "M"  "P"  "M"  "P"  "M"
[19,] "P"  "P"  "P"  "A"  "A"
[20,] "M"  "M"  "P"  "A"  "M"
 > genefilter(a, filt)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[16] FALSE FALSE FALSE FALSE FALSE
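
Every row fails here because f1 asks for a 'present' call in all 5 of the 
5 columns. Below is a minimal sketch with a lower threshold and a fixed 
seed, written in base R so it runs without genefilter; apply(calls, 1, f3) 
stands in for genefilter(calls, filterfun(f3)). The names calls, e, f3 and 
overA, the seed, and all cutoffs are made up for illustration. The last 
step shows one possible way to combine a call-based filter with an 
intensity-based one, not the only way.

```r
## Same filter constructor as above.
mascallsfilter <- function(cutoff = "p", number) {
    function(x) {
        sum(tolower(x) == tolower(cutoff)) >= number
    }
}

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

## Simulated call matrix: 20 probesets (rows) by 5 samples (columns).
calls <- matrix(sample(c("P", "M", "A"), 100, replace = TRUE), ncol = 5)

## Require a 'present' call in at least 3 of the 5 samples.
f3 <- mascallsfilter("p", 3)

## Apply the test to each row. With genefilter loaded,
## genefilter(calls, filterfun(f3)) should give the same result.
keepCalls <- apply(calls, 1, f3)

## Hypothetical expression values for the same 20 probesets.
e <- matrix(runif(100, min = 0, max = 500), ncol = 5)

## Intensity test in the style of kOverA(3, 100).
overA <- function(x) sum(x > 100) >= 3
keepExpr <- apply(e, 1, overA)

## The calls are character and the intensities numeric, so each matrix
## is filtered separately and the logical vectors are combined with '&'.
keep <- keepCalls & keepExpr
print(table(keep))
```

The resulting logical vector can be used to subset the rows of either 
matrix, e.g. e[keep, ].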

HTH,

Jim


> 
> Thank you!
> 
> Sincerely
> 
> Lina Hultin Rosenberg
> 
> ________________________________
> Lina Hultin Rosenberg
> Msc Molecular Biotechnology
> Evolutionary Biology Department
> Uppsala University
> Norbyvägen 18
> 752 36 Uppsala
> Phone: +46-18-4716444
> Email: lina.hultin.rosenberg at ebc.uu.se
> 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

