# [R] plot of Bernoulli data

Frank E Harrell Jr fharrell at virginia.edu
Thu Oct 4 15:05:59 CEST 2001

```Greg - My model for this type of application is to
either use

- shingle objects with Trellis/Lattice to allow overlapping
intervals (and for this application there is good reason
for intervals to overlap if you are not using the
better loess/lowess solution)
or
- the cut2 function in the soon (I hope) to be beta-released
Hmisc library, with tapply and all the other stratified
computation functions.  cut2 has the same arguments as
cut but also arguments g (number of quantile groups) and
m (number of observations per interval)

cut2 at present does not handle fixed windows as you
describe, but in general such windows are problematic.

The bigger point though is that there may be a slight
advantage in separating the stratification code (e.g.,
cut2) from the computation code (e.g., tapply and
summarize (in Hmisc)).

Frank Harrell

"Warnes, Gregory R" wrote:
>
> In the next release of the gregmisc library there is a function "wapply"
> that applies a specified function over 'windows' of x-y data (code below).
> You would use it in this case as:
>
>  x<-sort(runif(100,1,20))
>  p<-pnorm(x,10,3)
>  y<-as.numeric(runif(x)<p)
>  plot(x,y)
>  lines(x,p)
>
>  # draw a line with 10 points, each computed on regions spanning 1/5th of
> the x range
>  lines(wapply(x,y,n=10,width=1/5),col="red")
>
> -Greg
>
> # \$Id: wapply.R,v 1.2 2001/08/31 23:45:45 warneg Exp \$
> #
> "wapply" _ function( x, y, fun=mean, method="range",
>                     width=1/10, n=50, ...)
> {
>   method <- match.arg(method, c("width","range","nobs","fraction"))
>
>   if(method == "width" || method == "range" )
>     {
>       if(method=="range")
>         width <- width * range(x)
>
>
>       pts _ seq(min(x),max(x),length.out=n)
>
>       result _ sapply( pts, function(pts,y,width,fun,XX,...)
>                       {
>                         low _ min((pts-width/2),max(XX))
>                         high _ max((pts+width/2), min(XX))
>                         return (fun(y[(XX>= low) & (XX<=high)],...))
>                       },
>                       y=y,
>                       width=width,
>                       fun=fun,
>                       XX = x,
>                       ...)
>
>       return(x=pts,y=result)
>     }
>   else # method=="nobs" || method=="fraction"
>     {
>       if( method=="fraction")
>         width <- floor(length(x) * width)
>
>       ord <- order(x)
>       x  <- x[ord]
>       y  <- y[ord]
>
>       n  <- length(x)
>       center  <- 1:n
>       below <- sapply(center - width/2, function(XX) max(1,XX) )
>       above <- sapply(center + width/2, function(XX) min(n,XX) )
>
>       retval  <- list()
>       retval\$x  <- x
>       retval\$y  <- apply(cbind(below,above), 1,
>                          function(x) fun(y[x[1]:x[2]],...) )
>
>       return(retval)
>     }
>
> }
>
>  >  -----Original Message-----
>  >  From: Bill Simpson [mailto:wsi at gcal.ac.uk]
>  >  Sent: Tuesday, October 02, 2001 7:57 AM
>  >  To: r-help at stat.math.ethz.ch
>  >  Subject: Re: [R] plot of Bernoulli data
>  >
>  >
>  >  >  x<-sort(runif(100,1,20))
>  >  >  p<-pnorm(x,10,3)
>  >  >  y<-as.numeric(runif(x)<p)
>  >  >  plot(x,y)
>  >  >  lines(x,p)
>  >  df<-data.frame(x,y)
>  >  aggregate(df,list(x=(x<5),(x>5)&(x<10),(x>10) &
>  >  (x<15),(x>15)), FUN=mean)
>  >  gives me what I want but if anyone has a better way to collect the
>  >  observations into bins I'd like to hear it. It would be nice to
>  >  pass along something like
>  >  breaks<-c(5,10,15,20)
>  >
>  >  Thanks
>  >
>  >  Bill Simpson
>  >
>  >  -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.
>  >  -.-.-.-.-.-.-.-.-.-
>  >  r-help mailing list -- Read
>  >  http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
>  >  Send "info", "help", or "[un]subscribe"
>  >  (in the "body", not the subject !)  To:
>  >  r-help-request at stat.math.ethz.ch
>  >  _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._.
>  >  _._._._._._._._._._
>  >
>
> LEGAL NOTICE
> Unless expressly stated otherwise, this message is confidential and may be privileged. It is intended for the addressee(s) only. Access to this E-mail by anyone else is unauthorized. If you are not an addressee, any disclosure or copying of the contents of this E-mail or any action taken (or not taken) in reliance on it is unauthorized and may be unlawful. If you are not an addressee, please inform the sender immediately.
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

--
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

```