[BioC] Selecting genes for machine learning

January Weiner january.weiner at mpiib-berlin.mpg.de
Fri Jun 24 16:27:00 CEST 2011


Dear all,

What is currently regarded as the best strategy for selecting genes for
machine learning analysis? Using all of the 40k or so genes is not
feasible (at least with randomForest, which I use). "Bioconductor Case
Studies" suggests using nsFilter with the argument var.cutoff=0.75;
however, I am not sure how that cutoff is calculated. Are the genes
ranked by absolute variance? If so, is that method really suitable
for filtering out "uninteresting" genes?
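
For concreteness, the kind of filter in question can be sketched as
follows: compute a per-gene spread statistic and keep only the genes
above a chosen quantile of it. This is a minimal Python sketch and not
nsFilter's actual code; the names filter_by_spread and iqr, the use of
the IQR as the spread measure, and the strict ">" comparison are all
assumptions made for illustration.

```python
import random
import statistics


def iqr(values):
    """Interquartile range of one gene's expression values."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def filter_by_spread(expr, cutoff=0.75, spread=iqr):
    """Keep genes whose spread exceeds the `cutoff` quantile of all spreads.

    `expr` maps a gene id to its list of expression values across samples.
    `cutoff` is a quantile in [0, 1], not an absolute variance threshold.
    """
    scores = {gene: spread(vals) for gene, vals in expr.items()}
    ranked = sorted(scores.values())
    # Quantile of the per-gene spreads by linear interpolation.
    pos = cutoff * (len(ranked) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ranked) - 1)
    threshold = ranked[lo] + (pos - lo) * (ranked[hi] - ranked[lo])
    return {g: vals for g, vals in expr.items() if scores[g] > threshold}


# Hypothetical data: 1000 genes x 20 samples, each gene with its own spread.
random.seed(1)
expr = {}
for i in range(1000):
    sd = random.uniform(0.1, 2.0)
    expr[f"gene{i}"] = [random.gauss(8.0, sd) for _ in range(20)]

kept = filter_by_spread(expr, cutoff=0.75)
print(len(kept), "of", len(expr), "genes kept")
```

A filter of this kind is unsupervised: it never looks at the class
labels, so it removes genes that barely vary across samples, not genes
that fail to separate the classes.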

Kind regards,

January

-- 
-------- Dr. January Weiner 3 --------------------------------------
Max Planck Institute for Infection Biology
Charitéplatz 1
D-10117 Berlin, Germany
Web   : www.mpiib-berlin.mpg.de
Tel     : +49-30-28460514