[R] Help on selecting genes showing highest variance

Juliet Hannah juliet.hannah at gmail.com
Fri Jun 10 15:49:15 CEST 2011


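# Suppose your expression data is in a matrix named
# expression, in which the rows are genes and the
# columns are samples. The matrix must have row names
# (gene identifiers) for the subsetting step to work.

myvars <- apply(expression, 1, var, na.rm=TRUE)  # variance of each gene across samples
myvars <- sort(myvars, decreasing=TRUE)          # highest variance first; names are kept
myvars <- myvars[1:200]                          # keep the 200 most variable genes
expression <- expression[names(myvars),]         # subset the rows by gene name
dim(expression)                                  # 200 rows, one column per sample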
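
If the matrix has no row names, or has fewer than 200 genes, selecting by
name will not work as intended. The following is a minimal sketch of an
index-based variant using order() instead of sort(). The simulated data and
the names gene_var, n_top, top_idx and top_expression are illustrative
assumptions, not part of the original question.

```r
# Simulated stand-in for real data: 1000 genes (rows) x 12 samples (columns).
set.seed(1)
expression <- matrix(rnorm(1000 * 12), nrow = 1000)

# Per-gene variance across samples.
gene_var <- apply(expression, 1, var, na.rm = TRUE)

# Row positions of the genes, ordered from highest to lowest variance.
# head() caps the selection if the matrix has fewer than n_top genes.
n_top <- 200
top_idx <- head(order(gene_var, decreasing = TRUE), n_top)

# Subset by position, so no row names are required.
top_expression <- expression[top_idx, , drop = FALSE]
dim(top_expression)
```

Because order() returns row positions rather than sorted values, the same
index can be reused to pull matching rows from any annotation table kept
alongside the matrix.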


Also check out the genefilter package in Bioconductor. You may find that
the Bioconductor mailing list is a better place for questions like this one.


On Tue, Jun 7, 2011 at 9:47 AM, GIS Visitor 33 <gisv33 at gis.a-star.edu.sg> wrote:
> Hi
>
> I have a problem for which I would like to know a solution. I have gene expression data and I would like to choose only, let's say, the top 200 genes that have the highest expression variance across patients.
>
> How do I do this in R?
>
> I tried x=apply(leukemiadata,1,var)
> x1=x[order(-1*x)]
>
> but the problem here is that x and x1 are just numeric data. If I choose the first 200 after sorting in descending order, I do not know how to choose the associated samples with just the numeric values.
>
> Kindly help!
>
>
> Regards
> Ap


