[BioC] Filtering genes using a list

James MacDonald jmacdon at med.umich.edu
Wed Aug 18 19:48:32 CEST 2004


Marc,

I think finding the function you are looking for is usually more of an
art than a science. I have no idea how I found %in%, but my usual method
for finding functions that I think probably exist goes like this:

1.) help.search("something that I think might be a reasonable name for
the function")
2.) Google it to within an inch of its life ;-D. Usually I prepend an R
to the Google search to limit the results to actual R functions. There
are also search pages on www.r-project.org and www.bioconductor.org
that will search the mailing list archives.
3.) Look at code for functions that I already know might do something
similar and see how they do it.
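
As a rough sketch, steps 1 and 3 might look like this at the R prompt.
The search terms are only examples, and restricting help.search() to
the base package is an assumption made here to keep the search quick:

```r
# Step 1: keyword search over the installed help pages.
# "matching" is just an example query.
hits <- help.search("matching", package = "base")
print(hits)

# Related trick: list objects on the search path whose names contain "match".
candidates <- apropos("match")
print(candidates)

# Step 3: printing a function shows its source. Operators such as %in%
# need backticks (or quotes when asking for help, e.g. ?"%in%").
in_source <- deparse(`%in%`)
print(in_source)
```

Printing the source shows that %in% is a thin wrapper around match(),
which is also the help page where it is documented.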

By this time I have usually found what I am looking for, plus a bunch
of other stuff that may come in handy in the future. However, if I still
am hitting a wall, I ask on either the BioC or R-help listserv.

Best,

Jim



James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

>>> <mcolosim at brandeis.edu> 08/18/04 01:25PM >>>
Jim,

Thanks for the hint about %in%. Where did you find this function? I
couldn't find anything about it.

Also, it works the other way:

index <- geneNames(eset) %in% gene_list.tab[,1] 
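
The direction matters when the result is used to subset rows: a logical
index needs one entry per row of the matrix, so the matrix's names go on
the left of %in%. A minimal sketch with made-up data (the probe names,
chip names and the inline gene list are hypothetical stand-ins for
my.metric and gene_list.tab):

```r
# Hypothetical 5 x 4 expression matrix with probe IDs as row names.
my.metric <- matrix(seq_len(20), nrow = 5,
                    dimnames = list(paste0("affy_blah", 1:5),
                                    paste0("chip", 1:4)))

# Hypothetical tab-delimited gene list, read inline instead of from a file.
gene_list.tab <- read.delim(
  text = paste("probe\tdescription",
               "affy_blah2\taffy gene of interest 1",
               "affy_blah4\taffy gene of interest 2",
               "affy_not_on_chip\tnot present in the matrix",
               sep = "\n"),
  stringsAsFactors = FALSE)

# One TRUE/FALSE per row of the matrix: usable as a row index.
index <- row.names(my.metric) %in% gene_list.tab[, 1]
subset.data <- my.metric[index, , drop = FALSE]

# One TRUE/FALSE per listed gene: shows which listed genes are on the chip.
on_chip <- gene_list.tab[, 1] %in% row.names(my.metric)

print(dim(subset.data))
print(gene_list.tab[!on_chip, 1])
```

With an exprSet, geneNames(eset) would take the place of
row.names(my.metric) on the left-hand side.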

Marc

Quoting James MacDonald <jmacdon at med.umich.edu>:

> I would use the %in% function. This assumes that your matrix of gene
> values has the gene names appended somehow (row.names, or the first
> column). Since you are doing affy stuff, the easiest way is to use the
> exprSet holding your data.
> 
> index <- gene_list.tab[,1] %in% geneNames(eset)
> -or-
> index <- gene_list.tab[,1] %in% row.names(my.metric)
> 
> Then subset using the index.
> 
> subset.data <- my.metric[index,]
> 
> >>> <mcolosim at brandeis.edu> 08/18/04 11:29AM >>>
> This probably is a general R question, but I couldn't find anything
> useful. I found all sorts of stuff on how to filter using functions
> based on the values within the matrix, but nothing like this.
> 
> I have a list of genes in a file that I want to look at. How can I
> filter my matrix of genes to match the ones in the list?
> 
> gene_list.tab with 250 genes:
> probe{tab}description
> affy_blah1{tab}affy gene of interest 1
> affy_blah2{tab}affy gene of interest 2
> ..
> 
> dim(my.metric)
> [1] 22625    11
> 
> mmfun <- function() # to filter
> ffun <- filterfun(mmfun)
> my.fmetric <- genefilter(my.metric,ffun)
> dim(my.fmetric) ## This should give 250 and 11
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch 
> https://stat.ethz.ch/mailman/listinfo/bioconductor 
>



