[BioC] Classification and the Golub data set

James W. MacDonald jmacdon at med.umich.edu
Sun Apr 2 01:54:31 CEST 2006


Christos Hatzis wrote:
> I have not used this data, but if golubTest is an expression set,
> 
> exprs(golubTest)
> 
> will return the expression values as a matrix.  If this does not work,
> try
> 
> as.matrix(exprs(golubTest))

Another thing to consider is that most 'classical' statistical methods 
expect the data in the 'usual' format, with rows as samples 
(observations) and columns as variables. Microarray data follow the 
opposite convention, with columns as samples and rows as genes. Hence 
you need to transpose both matrices:

knn(t(exprs(golubTrain)), t(exprs(golubTest)), cl=golubTrain$ALL.AML, k=3)
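
If you want to see the effect of the transpose without loading the Golub 
data, here is a minimal sketch on simulated matrices. The object names 
(train, test, train_labels) and the simulated values are made up for 
illustration and only stand in for exprs(golubTrain), exprs(golubTest) and 
golubTrain$ALL.AML; knn() is assumed to be the one from the 'class' package.

```r
library(class)  # provides knn(); assumed to be the knn() used in this thread

set.seed(1)
n_genes <- 50

# Simulated data in microarray layout: rows = genes, columns = samples.
# These objects are hypothetical stand-ins for the Golub expression matrices.
train_labels <- factor(rep(c("ALL", "AML"), each = 10))
train <- matrix(rnorm(n_genes * 20), nrow = n_genes)
# Shift the first five genes in the AML samples so the classes differ.
train[1:5, train_labels == "AML"] <- train[1:5, train_labels == "AML"] + 3
test <- matrix(rnorm(n_genes * 6), nrow = n_genes)
test[1:5, 4:6] <- test[1:5, 4:6] + 3

# Untransposed: knn() sees 50 training "samples" (rows) but only 20 labels,
# so the call fails.
untransposed <- try(knn(train, test, cl = train_labels, k = 3), silent = TRUE)
failed <- inherits(untransposed, "try-error")

# Transposed: one row per sample, so the rows line up with the labels.
pred <- knn(t(train), t(test), cl = train_labels, k = 3)
length(pred)  # one predicted class per test sample
```

The untransposed call fails for the same reason as the original one: knn() 
counts rows as training cases, and there are as many rows as genes, not 
samples.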

Best,

Jim


> 
> -Christos
> 
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Chelsea Ellis
> Sent: Saturday, April 01, 2006 3:51 PM
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] Classification and the Golub data set
> 
> Hi,
> 
> I'm just learning Bioconductor, and I'm trying to do KNN classification
> using the Golub test and training sets with ALL and AML as the classifier.
> 
> When I use the function
> 
> knn(golubTrain, golubTest, cl=golubTrain$ALL.AML, k=3),
> 
> it says that the lengths of the training set and the classifier don't
> match.  The documentation on KNN says you need to have the test and training
> sets in matrix form, but I'm not sure how to change an expression set into a
> matrix.  I tried "unclass" and "as.matrix" with no luck.  This is probably
> an easy question, but I'm stuck.  Thanks for any help you can give.
> 
> Chelsea
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
> 


-- 
James W. MacDonald
University of Michigan
Affymetrix and cDNA Microarray Core
1500 E Medical Center Drive
Ann Arbor MI 48109
734-647-5623


