hist()

Peter Dalgaard BSA p.dalgaard@biostat.ku.dk
16 Nov 1998 14:26:38 +0100


Going over my old notes, I realised that hist() has changed since the
earlier versions of R, in that the intervals are now left-open,
right-closed, i.e. (a,b], rather than the opposite, [a,b). This is a
change in the direction of S-plus compatibility, but I wonder how
sensible it really is.

The main problem is with ages, where you'd naturally take age 17 as
representing something between 17 and 18, but:

> brk<-c(15,16,17,18)
> print(hist(17,breaks=brk,plot=F))
$breaks
[1] 15 16 17 18

$counts
[1] 0 1 0

$intensities
[1] 0 1 0

$mids
[1] 15.5 16.5 17.5

so a 17-year-old gets put in the 16-17 bracket.

The workaround is to add a small number to the data or to subtract one
from the breakpoints, but I still wonder whether the behaviour
shouldn't be changed (generally, or with an option). R's hist() has a
couple of improvements over S-plus already, particularly a different
default in the case of non-equidistant breaks.
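
For concreteness, here is a minimal sketch of the second variant of the
workaround (shifting the breakpoints), assuming integer ages. The data
vector `ages` and the offset `eps` are made up for illustration; eps
only has to be smaller than the resolution of the data.

```r
# Illustrative data (an assumption, not from the original post):
# one 15-year-old, one 16-year-old and two 17-year-olds.
ages <- c(15, 16, 17, 17)
brk  <- c(15, 16, 17, 18)

# Arbitrary small offset; anything well below 1 works for integer ages.
eps <- 1e-3

# Shifting every breakpoint down by eps turns the right-closed cells
# (a, b] into (a - eps, b - eps], which for integer data behaves
# like [a, b).
h <- hist(ages, breaks = brk - eps, plot = FALSE)
print(h$counts)
```

With the unshifted breaks, the two 17-year-olds would be counted in the
16-17 cell instead.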

[hist() returning an invisible result even with plot=F is also a bug,
but that's more easily fixed]

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907