[BioC] Heatmap.2 scale problems: Scaling inside the function gives different results than scaling outside!!!

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Jul 22 22:04:06 CEST 2010


Hi,

2010/7/22 Elmer Fernández <elmerfer at gmail.com>:
> Hi Benjamin
> Are you sure about that?

Looking at the source code for heatmap.2 (and heatmap, for that
matter) it looks as if Benjamin is correct. The scaling is done after
the clustering.
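
To make that concrete, here is a minimal sketch in base R (not taken
from the heatmap.2 source) of why the order matters. It assumes the
default distfun = dist and hclustfun = hclust, and it reuses the matrix
from Elmer's example. The set.seed() call and the names d_raw, d_scaled
and distances_equal are my own additions for illustration.

```r
# Sketch: column scaling changes the row-to-row distances that drive the
# row clustering, so clustering before vs. after scaling can disagree.
set.seed(1)  # assumption: fixed seed, only so the run is repeatable
M <- matrix(c(rnorm(10 * 3, 1, 2), rnorm(10 * 2, -0.5, 1)), ncol = 5)

d_raw    <- dist(M)         # distances on the unscaled data
d_scaled <- dist(scale(M))  # distances after column-wise standardisation

# Compare the two sets of pairwise distances.
distances_equal <- isTRUE(all.equal(as.vector(d_raw), as.vector(d_scaled)))
print(distances_equal)

# Leaf orderings from the two clusterings; these may or may not coincide.
ord_raw    <- hclust(d_raw)$order
ord_scaled <- hclust(d_scaled)$order
print(ord_raw)
print(ord_scaled)
```

If heatmap.2 clusters on the raw matrix and only then scales for the
colour mapping, the first call in Elmer's code corresponds to d_raw and
the second to d_scaled.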

> If so, I think that it is not correct, right?

I guess it depends on what you were expecting it to do :-)

Having just realized this myself (yikes -- see what happens when we
assume(?)), I think I'd more often rather send in a scaled version of
the data and have scale='none' in the heatmap call, to be honest.
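
Something along these lines, as a sketch rather than a recipe. It
assumes gplots is installed, and the names M_col and M_row plus the
seed are hypothetical choices of mine. Note that scale() standardises
columns, so row-wise scaling needs a transpose on the way in and out.

```r
# Sketch: scale the data yourself, then tell heatmap.2 not to rescale.
set.seed(1)  # assumption: fixed seed, only so the run is repeatable
M <- matrix(c(rnorm(10 * 3, 1, 2), rnorm(10 * 2, -0.5, 1)), ncol = 5)

M_col <- scale(M)        # each column: mean 0, sd 1
M_row <- t(scale(t(M)))  # each row: mean 0, sd 1

# Draw only in an interactive session and only if gplots is available.
if (interactive() && requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(M_col, scale = "none", trace = "none",
                    main = "scaled outside")
}
```

This way the clustering and the colours are both based on the same
(scaled) values, which is usually what I would have expected anyway.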

-steve

> best
> Elmer
>
> 2010/7/22 Benjamin Otto <b.otto at uke.uni-hamburg.de>
>
>> Hi Guys,
>>
>> do note that the scale option in heatmap doesn't scale your values until
>> AFTER clustering, and only for visualization purposes! So if you provide
>> already scaled data, you should naturally expect a different result.
>>
>> cheers
>>
>> Benjamin
>>
>> On 22.07.2010 at 16:25, Bazeley, Peter wrote:
>>
>> > Hi Elmer,
>> >
>> > The default scale option in heatmap.2 scales by row, whereas the scale()
>> > function scales by column, so this is probably why there is a difference. I
>> > think whichever dimension contains unique samples is how you want to scale
>> > (if this was expression data, for example).
>> >
>> >
>> > Pete
>> > ________________________________________
>> > From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] on behalf of Sean Davis [sdavis2 at mail.nih.gov]
>> > Sent: Thursday, July 22, 2010 9:17 AM
>> > To: Elmer Fernández
>> > Cc: Bioconductor mailing list
>> > Subject: Re: [BioC] Heatmap.2 scale problems: Scaling inside the function gives different results than scaling outside!!!
>> >
>> > 2010/7/22 Elmer Fernández <elmerfer at gmail.com>
>> >
>> >> Dear Users
>> >> I'm working with the heatmap.2 function and I realized that using the
>> >> scale input parameter gives different results than using the scale()
>> >> function outside and feeding heatmap.2 the scaled matrix. I attached
>> >> the results of the two approaches and the data matrix used (M.csv).
>> >> So, what am I doing wrong?
>> >>
>> >>
>> > Hi, Elmer.
>> >
>> > The default distance function used by heatmap.2 is dist() which is not
>> > going to be invariant under centering and scaling, I don't think.  It looks
>> > like you are using that default.
>> >
>> > Sean
>> >
>> >
>> >> R Code
>> >>
>> >> library(gplots)
>> >> M=matrix(c(rnorm(10*3,1,2),rnorm(10*2,-0.5,1)),ncol=5)
>> >> heatmap.2(M,scale="column",trace="none",main="scaled inside")
>> >> x11();heatmap.2(scale(M),scale="none",trace="none",main="scaled outside")
>> >>
>> >>> sessionInfo()
>> >> R version 2.10.0 (2009-10-26)
>> >> x86_64-unknown-linux-gnu
>> >>
>> >> locale:
>> >> [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C
>> >> LC_TIME=en_US.UTF-8           LC_COLLATE=en_US.UTF-8
>> >> [5] LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8
>> >> LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8
>> >> [9] LC_ADDRESS=en_US.UTF-8        LC_TELEPHONE=en_US.UTF-8
>> >> LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8
>> >>
>> >> attached base packages:
>> >> [1] grid      stats     graphics  grDevices utils     datasets  methods
>> >> base
>> >>
>> >> other attached packages:
>> >> [1] gplots_2.7.4   caTools_1.10   bitops_1.0-4.1 gdata_2.7.1
>> >> gtools_2.6.1   rkward_0.5.1
>> >>
>> >> loaded via a namespace (and not attached):
>> >> [1] tools_2.10.0
>> >>
>> >>
>> >> --
>> >> Elmer A. Fernández (Bioing. PhD)
>> >> Investigador Asistente CONICET - Research Assistant CONICET
>> >> Prof. Inteligencia Artificial -UCC - Prof. Artificial Intelligence @ UCC
>> >> tel: +54-(0)351-4938000 int 145
>> >> Fax: +54-(0)351-4938081
>> >> web page : http://www.uccor.edu.ar/modelo.php?param=3.8.5.15
>> >> http://sites.google.com/site/biologicaldatamininggroup/Home/
>> >> mail address: Camino Alta Gracia Km 7.1/2- Córdoba-5017-Argentina
>> >>
>> >>
>> >> _______________________________________________
>> >> Bioconductor mailing list
>> >> Bioconductor at stat.math.ethz.ch
>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >> Search the archives:
>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >>
>>
>> ___________________________________________
>> Benjamin Otto, PhD
>> University Medical Center Hamburg-Eppendorf
>> Institute For Clinical Chemistry / Central Laboratories
>> Campus Forschung N27
>> Martinistr. 52,
>> D-20246 Hamburg
>>
>> Tel.: +49 40 7410 51908
>> Fax.: +49 40 7410 54971
>> ___________________________________________
>
>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


