[BioC] Selecting genes for machine learning

Fri Jun 24 18:17:31 CEST 2011

Filtering by variance is certainly an acceptable way to go.
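
On how the cutoff is computed: as far as I recall from the genefilter
documentation, nsFilter by default measures per-gene spread with the IQR
rather than the raw variance, and reads var.cutoff as a quantile, so
var.cutoff=0.75 would keep roughly the most variable quarter of the genes.
Please check ?nsFilter before relying on that. The idea can be sketched
outside R as below. This is a rank-based illustration only, not nsFilter's
actual code; the names iqr and filter_by_spread and the dict-of-lists
input are made up for the example.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range of one gene's expression values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def filter_by_spread(expr, cutoff=0.75):
    """Keep the most variable genes, ranked by IQR across samples.

    expr   : dict mapping gene id -> list of expression values (hypothetical layout)
    cutoff : fraction of genes to discard, e.g. 0.75 keeps about the top 25%
    """
    spreads = {gene: iqr(vals) for gene, vals in expr.items()}
    # Number of genes to keep; always keep at least one.
    n_keep = max(1, round((1 - cutoff) * len(spreads)))
    # Rank genes from largest to smallest spread (ties broken by input order).
    ranked = sorted(spreads, key=spreads.get, reverse=True)
    keep = set(ranked[:n_keep])
    return {gene: vals for gene, vals in expr.items() if gene in keep}
```

Note that such a filter never looks at the class labels, so it can be
applied once before training without biasing cross-validation. It only
removes genes that barely change across samples; it says nothing about
whether the remaining genes are relevant to the outcome.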

Sean

On Fri, Jun 24, 2011 at 10:27 AM, January Weiner
<january.weiner at mpiib-berlin.mpg.de> wrote:
> Dear all,
>
> what is currently regarded as the optimal strategy to select genes for
> machine learning analysis? Taking all of the 40k or so genes is not
> doable (at least with randomForest, which I use). "Bioconductor case
> studies" suggests using nsFilter with argument var.cutoff=0.75, but
> I am not sure how that cutoff is calculated. Are the genes sorted
> according to absolute variance? If yes, is that method really suitable
> for filtering "uninteresting" genes?
>
> Kind regards,
>
> January
>
> --
> -------- Dr. January Weiner 3 --------------------------------------
> Max Planck Institute for Infection Biology
> Charitéplatz 1
> D-10117 Berlin, Germany
> Web   : www.mpiib-berlin.mpg.de
> Tel     : +49-30-28460514