[R] histogram question
Frank E. Harrell Jr
fharrell at virginia.edu
Thu Nov 15 12:22:26 CET 2001
The optimum bin widths to use in estimating the underlying
density function need to shrink as the sample size increases.
A histogram with 10 bins for a sample size of 20 would
be almost pure noise.
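
A minimal sketch of why the default changes with the amount of data,
assuming the default breaks = "Sturges" and data ranging from 0 to 10 as
in the examples quoted below. The variable names are only for
illustration. The suggested number of cells depends on the sample size
alone, and hist() then passes that count to pretty() to choose round
break points:

```r
# Sketch only: how the default number of histogram cells depends on n.
# nclass.Sturges() is the rule behind hist()'s default breaks = "Sturges".
n_small <- 1000
n_large <- 100000

# Sturges' rule uses only the length of the data: ceiling(log2(n) + 1).
k_small <- nclass.Sturges(numeric(n_small))
k_large <- nclass.Sturges(numeric(n_large))

# hist() treats that count as a suggestion and hands it to pretty()
# together with the data range (0 to 10 is assumed here).
b_small <- pretty(c(0, 10), n = k_small, min.n = 1)
b_large <- pretty(c(0, 10), n = k_large, min.n = 1)

print(c(k_small, k_large))   # suggested cell counts
print(diff(b_small)[1])      # cell width for n = 1000
print(diff(b_large)[1])      # cell width for n = 100000
```

With integer-valued data, a cell width below 1 leaves some cells empty,
and those empty cells are what appear as space between the bars.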
Frank
Erich Neuwirth wrote:
> thanks for all the help.
>
> one question remains.
> if histogram is meant for continuous data
> (which makes sense)
> why is it changing the defaults of the graphics
> depending on the amount of data,
> and not on the scale of the data.
>
> in both my examples, i had a data vector with numbers ranging from 0 to 10,
> once with 1000 elements,
> once with 100000 elements.
>
> this is the same "quality" of data.
> should the graphics defaults not stay consistent with that?
>
> Ben Bolker wrote:
>
>> The basic problem is that hist() is really designed for continuous data,
>>and you're using it with discrete data. You can either say
>>
>>r <- rbinom(100000,10,0.5)
>>hist(r,col=2,xlim=c(0,10),ylim=c(0,30000),
>> breaks=seq(-0.5,10.5,by=1))
>>
>>so that the bins span (-0.5 to 0.5, 0.5 to 1.5, ...)
>>
>>or (arguably better, because it is more sensible with discrete data)
>>
>>barplot(table(r),space=0)
>>
>>On Mon, 12 Nov 2001, Erich Neuwirth wrote:
>>
>>
>>>hist(rbinom(1000,10,0.5),col=2,xlim=c(0,10),ylim=c(0,300))
>>>gives a histogram with "touching bars"
>>>
>>>hist(rbinom(100000,10,0.5),col=2,xlim=c(0,10),ylim=c(0,30000))
>>>gives a histogram with space between the bars.
>>>
>>>is there a way to control the space between the bars easily?
>>>
>
> --
> Erich Neuwirth, Computer Supported Didactics Working Group
> Visit our SunSITE at http://sunsite.univie.ac.at
> Phone: +43-1-4277-38624 Fax: +43-1-4277-9386
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>
>
--
Frank E Harrell Jr Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat