[BioC] Generating random gene lists: does sample/resample generate random sets
    Ochsner, Scott A 
    sochsner at bcm.tmc.edu
       
    Wed Sep 10 22:15:57 CEST 2008
    
    
  
Sorry, below is my sessionInfo():
> sessionInfo()
R version 2.7.0 (2008-04-22) 
i386-pc-mingw32 
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
 [1] MLInterfaces_1.14.1  annotate_1.18.0      xtable_1.5-2         AnnotationDbi_1.2.1  RSQLite_0.6-8        DBI_0.2-4           
 [7] rda_1.0              rpart_3.1-41         genefilter_1.20.0    survival_2.34-1      MASS_7.2-41          affy_1.18.1         
[13] preprocessCore_1.2.0 affyio_1.8.0         Biobase_2.0.1       
loaded via a namespace (and not attached):
[1] class_7.2-41
 
Scott 
-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Ochsner, Scott A
Sent: Wednesday, September 10, 2008 3:03 PM
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] Generating random gene lists: does sample/resample generate random sets
Dear BioC,
I would like feedback on whether the following procedure is appropriate for producing a set of 1000 random gene lists, each of length 2000.  The idea is to use these random lists to assess how often a random gene list of the same size as myCuratedList can reproduce or improve on the classification performance of myCuratedList.
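
One way to turn "how often" into a number is an empirical p-value over the random lists. The sketch below only illustrates that arithmetic. The names curatedPerf and randomPerf are hypothetical, and the accuracies are simulated placeholders, not results from any real classifier; in practice they would come from running the same classifier and cross-validation scheme on each list.

```r
# Hypothetical inputs: accuracy of the curated list and of each random list.
# These values are simulated placeholders, not real results.
set.seed(3)
curatedPerf <- 0.85
randomPerf  <- rbeta(1000, 70, 30)   # simulated accuracies centred near 0.70

# number of random lists that reproduce or improve on the curated list
nAsGood <- sum(randomPerf >= curatedPerf)

# empirical p-value; the +1 in numerator and denominator keeps it above 0
empP <- (nAsGood + 1) / (length(randomPerf) + 1)
empP
```

With 1000 random lists the smallest p-value this estimate can report is 1/1001, which is one reason the number of random lists matters.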
  
#remove myCuratedList from the universe of possible genes.  The "eset" object is your standard ExpressionSet object.
>length(myCuratedList)
 [1] 2000
>Index<-setdiff(1:length(rownames(exprs(eset))),myCuratedList)
>length(Index)
 [1] 20277
#generate 1000 random gene lists using the genes in Index.  The code for resample is taken from the help pages for sample.
>randomMatrix<-replicate(1000,resample(Index,2000))
>dim(randomMatrix)
 [1] 2000 1000
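
For anyone who wants to run the procedure end to end without the real data, here is a minimal self-contained sketch. It rests on several assumptions: myCuratedList is taken to hold row indices rather than probe IDs, a toy universe of 200 rows stands in for the ExpressionSet, the list sizes and the seed are placeholders, and resample is written in the short form shown in the examples of ?sample in recent R versions (which may differ from the version on the R 2.7.0 help page).

```r
# resample() in the form given in the examples of ?sample (recent R);
# unlike sample(), it does not treat a length-1 numeric x as 1:x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

set.seed(1)                            # placeholder seed, for reproducibility

nGenes        <- 200                   # toy stand-in for nrow(exprs(eset))
myCuratedList <- sample(nGenes, 20)    # toy curated list, stored as row indices

# universe of candidate rows with the curated genes removed
Index <- setdiff(seq_len(nGenes), myCuratedList)

# 50 random lists of 20 genes each, one list per column; each column is
# drawn without replacement (the default for sample.int)
randomMatrix <- replicate(50, resample(Index, 20))

dim(randomMatrix)
```

Since Index has far more than one element here, calling sample(Index, 20) directly should give the same kind of draw; resample only matters in the edge case where the vector being sampled has length 1.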
I've verified that no column contains repeated genes, as should be the case when resample draws without replacement.
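
A check of this kind could look like the sketch below. It builds its own small toy matrix (toyMatrix, a hypothetical name) so it can be run on its own; the same two checks would apply to the real matrix of random lists.

```r
set.seed(2)
# toy matrix: 10 lists of 5 indices, each column drawn without replacement
toyMatrix <- replicate(10, sample(100, 5))

# anyDuplicated() returns 0 when a vector has no repeated values
dupPerColumn <- apply(toyMatrix, 2, anyDuplicated)
noRepeats    <- all(dupPerColumn == 0)

# separate question: how many columns are distinct when viewed as sets?
sortedCols     <- apply(toyMatrix, 2, sort)
nDistinctLists <- ncol(unique(sortedCols, MARGIN = 2))

noRepeats
nDistinctLists
```

Note that genes are expected to recur across different columns, since each list is drawn independently from the same universe; only repeats within a column would indicate a problem.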
Is there a standard procedure for doing the above or is what I've done kosher?
Scott A. Ochsner, Ph.D.
NURSA Bioinformatics
Molecular and Cellular Biology
Baylor College of Medicine
Houston, TX. 77030
phone: 713-798-6227 
_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
    
    