[R] hist{graphics}

Duncan Mackay du|c@|m@ @end|ng |rom b|gpond@com
Mon Jul 15 02:32:58 CEST 2019


There is also this article:

@ARTICLE{JCGS18-0021,
  author = {Denby, L. and Mallows, C.},
  year = {2009},
  title = {Variations on the histogram},
  journal = {Journal of Computational and Graphical Statistics},
  volume = {18},
  number = {1},
  pages = {21--31},
  doi = {10.1198/jcgs.2009.0002},
  abstract = {When constructing a histogram, it is common to make all bars the same
	width. One could also choose to make them all have the same area.
	These two options have complementary strengths and weaknesses; the
	equal-width histogram oversmooths in regions of high density, and
	is poor at identifying sharp peaks; the equal-area histogram oversmooths
	in regions of low density, and so does not identify outliers. We
	describe a compromise approach which avoids both of these defects.
	We regard the histogram as an exploratory device, rather than as
	an estimate of a density. We argue that relying on the asymptotics
	of integrated mean squared error leads to inappropriate recommendations
	for choosing bin-widths. Datasets and R codes are available in the
	online supplements.},
  keywords = {diagonally-cut histogram; equal-area histogram; asymptotics;
	IMSE.},
}

I have not looked at the site for a while, but I think it has some code in ?Splus, which should work in R.
The article follows a report of the same name that also had code, but the report appears to be no longer available at the original site.
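
To make the equal-width versus equal-area distinction in the abstract concrete, here is a minimal sketch in base R. It is not the code from the article or its supplements: the simulated sample and the names x, k, eq_width and eq_area are made up for illustration, and the equal-area bins are approximated by placing the breaks at sample quantiles.

```r
# Illustrative only: not the authors' code. Simulated, right-skewed sample.
set.seed(1)
x <- rexp(200)
k <- 10  # number of bins (arbitrary choice for this sketch)

# Equal-width bins: k intervals of identical width across the range of x.
eq_width <- hist(x, breaks = seq(min(x), max(x), length.out = k + 1),
                 plot = FALSE)

# Equal-area bins: breaks at sample quantiles, so each bin holds about the
# same number of observations (equal bar area on the density scale).
eq_area <- hist(x, breaks = quantile(x, probs = seq(0, 1, length.out = k + 1)),
                plot = FALSE)

eq_width$counts  # counts differ strongly between bins
eq_area$counts   # counts are close to length(x) / k in every bin
```

Dropping plot = FALSE draws the two histograms; with unequal bin widths hist() plots densities rather than counts by default.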

Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2350

-----Original Message-----
From: R-help [mailto:r-help-bounces using r-project.org] On Behalf Of peter dalgaard
Sent: Sunday, 14 July 2019 02:15
To: Duncan Murdoch
Cc: r-help using r-project.org; Steven
Subject: Re: [R] hist{graphics}

Also check out MASS::truehist, or simply consider setting breaks so that they do not coincide with data values. (That hist() does not do something like this, but instead actively aims for pretty breaks, is something of a design bug in my book, but it is ancient history and not easy to change at this point in time.)
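
As a rough sketch of what this means in practice, the code below compares the default bins, right = FALSE, and breaks shifted by half a unit. The integer sample is hard-coded for illustration (an assumption, chosen so that its bin counts match the frequencies reported in the original question), and the names counts_default, counts_left and counts_shifted are made up for this example.

```r
# Hard-coded integer sample, for illustration only.
x <- c(93, 96, 97, 97, 97, 98, 100, 100, 100, 101, 102, 102, 106, 107, 108)
breaks <- seq(90, 110, by = 5)

# Default right = TRUE: bins are (a, b], so the value 100 lands in (95, 100].
counts_default <- hist(x, breaks = breaks, plot = FALSE)$counts

# right = FALSE: bins are [a, b), so the value 100 lands in [100, 105).
counts_left <- hist(x, breaks = breaks, right = FALSE, plot = FALSE)$counts

# Breaks shifted by half a unit never coincide with integer data, so the
# result no longer depends on which side of each bin is closed.
shifted <- seq(89.5, 109.5, by = 5)
counts_shifted <- hist(x, breaks = shifted, plot = FALSE)$counts

rbind(counts_default, counts_left, counts_shifted)

# MASS::truehist offsets its bins slightly by default (x0 = -h/1000),
# which has a similar effect. MASS is assumed to be installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  MASS::truehist(x, h = 5, prob = FALSE)
}
```

With this sample the default call gives (1,8,3,3), while both right = FALSE and the shifted breaks give (1,5,6,3).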

-pd

> On 13 Jul 2019, at 11:29 , Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
> 
> On 12/07/2019 11:38 a.m., Steven wrote:
>> Never mind. Thanks.
>> I found that adding the argument right=FALSE to the call fixes it.
> 
> Drawing a histogram of discrete data often leads to bad results. Histograms are intended for continuous data, where no observations fall on bin boundaries.
> 
> You often get a more faithful representation of discrete data using something like
> 
> plot(table(x))
> 
> Duncan Murdoch
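
A minimal sketch of the plot(table(x)) suggestion follows. The integer sample is hard-coded for illustration (an assumption, not taken from the thread), and the 90:110 range used for tab_full is an arbitrary choice for this example.

```r
# Hard-coded integer sample, for illustration only.
x <- c(93, 96, 97, 97, 97, 98, 100, 100, 100, 101, 102, 102, 106, 107, 108)

# table() counts each distinct value, so no binning decision is involved.
tab <- table(x)
tab

# plot() on a one-way table draws one vertical line per distinct value,
# placed at that value on the x axis when the names are numeric.
plot(tab, xlab = "x", ylab = "Frequency")

# Listing unobserved values as factor levels shows gaps as zero counts.
tab_full <- table(factor(x, levels = 90:110))
barplot(tab_full, xlab = "x", ylab = "Frequency")
```

Because every distinct value gets its own bar or line, the question of which side of a bin is closed does not arise.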
> 
>> On 2019/7/12 5:10 PM, Steven wrote:
>>> # Can someone help with this simple frequency histogram problem (n = 15)?
>>> # I use four classes: [90,95), [95,100), [100,105), [105,110).
>>> # Their limits coincide with those obtained by pretty {base}.
>>> # The proper frequencies would be (1,5,6,3),
>>> # but hist {graphics} gives me a histogram showing frequencies (1,8,3,3),
>>> # with or without the argument breaks = ...
>>> # Reproducible code below. Thanks.
>>> 
>>> set.seed(123)
>>> x <- rnorm(15, mean = 100, sd = 5); x <- as.integer(x)  # truncate to integers
>>> x <- sort(x)
>>> x
>>> breaks <- seq(90, 110, by = 5); breaks
>>> pretty(x, n = 5)  # pretty {base}
>>> x.cut <- cut(x, breaks, right = FALSE); x.cut  # left-closed classes [a, b)
>>> freq <- table(x.cut); cbind(freq)
>>> hist(x, breaks = breaks)  # hist {graphics}; bins are (a, b] by default
>>> hist(x)
>>> 
>>> 
>>> 
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com
