[BioC] subsetting the genes for cluster

Thu Sep 4 21:06:28 CEST 2008

On Thu, Sep 4, 2008 at 10:59 AM, Abhilash Venu <abhivenu at gmail.com> wrote:
> On Thu, Sep 4, 2008 at 5:21 AM, Mark Cowley <m.cowley at garvan.org.au> wrote:
>
>> Hi Abhilash,
>>
>> On 02/09/2008, at 11:09 PM, Abhilash Venu wrote:
>>
>>> Hi all,
>>>
>>> I am working on single-color expression data using limma. I would like to
>>> perform a cluster analysis after selecting the differentially expressed
>>> genes based on the P value (say 0.001). As far as I know, I have to subset
>>> the normalized data (MA) to these selected genes in order to retrieve
>>> their distribution across the samples.
>>>
>> That's correct
>
>
>
> Thank you Mark, but I am quite confused here. Our collaborator has already
> run single-color arrays on the Agilent platform. When I clustered the data
> using the method I described, the color key showed only positive values
> (all the values are positive if I take them from MA). Our collaborator
> feels that this is quite unusual, because green usually represents
> down-regulation. Could you suggest how I should go about it?

Did you use heatmap.2 to draw the heatmap? If so, its "scale" argument
might be useful. For any function that is new to you, I would advise
reading the whole help page, as there is often very useful information
there.
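
As a rough sketch of what row scaling does (the matrix below is simulated,
not your data, and the object names expr and z are made up for
illustration): single-color log intensities are all positive, but once each
gene is centred on its own mean the values fall on both sides of zero, which
is what a red/green color key normally shows. The heatmap.2 call is only
attempted if gplots is installed.

```r
# Simulated log2 intensities (hypothetical data): 6 genes x 4 arrays
set.seed(1)
expr <- matrix(rnorm(24, mean = 10, sd = 1), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("array", 1:4)))

# Row-wise z-scores: centre each gene on its own mean and divide by its SD.
# scale() works on columns, hence the two transposes.
z <- t(scale(t(expr)))

range(expr)  # raw values: all positive
range(z)     # scaled values: negative and positive

# scale = "row" asks heatmap.2 to apply the same per-gene scaling for display
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(expr, scale = "row", trace = "none")
}
```

Whether scaling by row is appropriate depends on the question; it shows each
gene relative to its own average and hides absolute intensity.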

>>>
>>> But I am wondering whether I can perform using the R script?
>>>
>> Can you elaborate on "using the R script"?
>
> I was not sure about the R script for subsetting, so I performed it using
> Python.

You can try help.search('subset'), as a start.  RSiteSearch is also
useful for searching for answers.

You will likely benefit from reading:

http://cran.r-project.org/doc/manuals/R-intro.html

And potentially from:

http://biostat-09.berkeley.edu/~bullard/courses/T-berkeley-08/resources/R_intro_easy.pdf
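
To give an idea of how little R is needed for the subsetting itself, here is
a minimal sketch. Everything in it is a stand-in: norm plays the role of the
normalised data matrix, tt plays the role of a toptable result, and the
column names ID and P.Value are assumptions that you should check against
your own objects.

```r
set.seed(42)

# Hypothetical stand-in for the normalised data: 100 probes x 6 arrays
norm <- matrix(rnorm(600, mean = 8), nrow = 100,
               dimnames = list(paste0("probe", 1:100), paste0("array", 1:6)))

# Hypothetical stand-in for a toptable: the first 5 probes get small P values
tt <- data.frame(ID = rownames(norm),
                 P.Value = c(rep(0.0001, 5), seq(0.01, 0.95, length.out = 95)),
                 stringsAsFactors = FALSE)

# Names of the probes that pass the cutoff
de.ids <- tt$ID[tt$P.Value < 0.001]

# Subset rows by name; drop = FALSE keeps a matrix even if one probe passes
de.data <- norm[de.ids, , drop = FALSE]
dim(de.data)

# Hierarchical clustering of the selected probes
hc <- hclust(dist(de.data))
```

Indexing by row name only works if the identifiers in the table match
rownames() of the data matrix, so it is worth checking that with
all(de.ids %in% rownames(norm)) first.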

>>>
>>> I would appreciate any help.
>>>
>> You need 2 things: the names of the DE genes, and the normalised data.
>> Get the DE genes from your toptable, and the normalised data from within
>> your MA object (hint: names(MA) ).
>> Then subset the normalised data to just the rows for the DE genes, then
>> perform the cluster analysis. There are a large number of ways of doing
>> this. To get you started, have a look at heatmap.2 from the package gplots.
>> Others include the built-in:
>> hclust( dist( yourDEdata ) )
>>
>> cheers,
>> Mark
>>
>> -----------------------------------------------------
>> Mark Cowley, BSc (Bioinformatics)(Hons)
>>
>> Peter Wills Bioinformatics Centre
>> Garvan Institute of Medical Research, Sydney, Australia
>> -----------------------------------------------------
>>
>>
>
>
> --
>
> Regards,
> Abhilash
>