[BioC] supervised & unsupervised analysis of samples of microarray data

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Apr 26 06:05:45 CEST 2012


Hi,

Before we get into the weeds over supervised vs. unsupervised
learning, I'm curious -- how is your data clustering? Does the
clustering reflect a technical artifact (batch effect) rather than
the biological "effect" you are trying to see? See, for example:

Tackling the widespread and critical impact of batch effects in
high-throughput data
http://www.nature.com/nrg/journal/v11/n10/full/nrg2825.html

Are you trying to do this differential expression/clustering thing as
a QC thing? A "gene signature" thing?
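
One rough way to check the batch question is to compare how well the
cluster assignments line up with batch labels versus subtype labels.
The following is only a minimal sketch in Python: the purity() helper
and all of the sample labels are hypothetical, made up for
illustration, and not taken from the data discussed in this thread.

```python
from collections import Counter, defaultdict


def purity(clusters, labels):
    """Fraction of samples carrying the majority label of their cluster."""
    by_cluster = defaultdict(list)
    for cluster_id, label in zip(clusters, labels):
        by_cluster[cluster_id].append(label)
    # For each cluster, count how many samples carry its most common label.
    majority_total = sum(
        Counter(members).most_common(1)[0][1]
        for members in by_cluster.values()
    )
    return majority_total / len(clusters)


# Hypothetical example: 8 samples, 2 clusters, 2 batches, 2 subtypes.
clusters = [1, 1, 1, 1, 2, 2, 2, 2]
batch = ["A", "A", "A", "A", "B", "B", "B", "B"]
subtype = ["x", "y", "x", "y", "x", "y", "x", "y"]

batch_purity = purity(clusters, batch)
subtype_purity = purity(clusters, subtype)

print("purity vs. batch:  ", batch_purity)
print("purity vs. subtype:", subtype_purity)
```

In this made-up case the clusters track batch perfectly and subtype
no better than chance, which is the pattern a batch effect would
produce. Purity rises with the number of clusters, so it is only
meaningful when comparing labelings against the same clustering.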

-steve

On Wed, Apr 25, 2012 at 11:26 PM, wenhuo hu <huwenhuo at gmail.com> wrote:
> Hi all,
>
> I have recently been analyzing microarray data. The samples fall into
> several groups representing different disease subtypes. Here is what I
> did: I identified significant genes, extracted the expression levels of
> these genes, and performed a cluster analysis using the gplots package
> in R. My problem is that the cluster analysis did not group the samples
> well according to disease subtype, so I assumed this was a question of
> supervised versus unsupervised clustering. From what I have read
> online, that does not seem quite right, because supervised analysis
> more typically describes classifying new samples based on previous
> data. Then there is also the concept of semi-supervised analysis, and
> here I am already confused. Are methods such as PAM, SOM, and k-means
> supervised or semi-supervised clustering? Could anyone take the time to
> clarify supervised, semi-supervised, and unsupervised analysis for me,
> and recommend any Bioconductor packages that might help me group the
> samples according to disease subtype?
>
> I like programming and have a biology/medicine background, with
> relatively limited bioinformatics experience. Any interpretation is
> welcome.
>
> Thanks!
>
>
> Wenhuo Hu
> Park lab
>
> Memorial Sloan Kettering Cancer Center
> Zuckerman Research Building
> 408 East 69th Street
> Room ZRC-527
> New York, NY 10065
> Phone 646-888-3220
> huw at mskcc.org



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


