# [R] Putting regression lines on SPLOM

Deepayan Sarkar deepayan at stat.wisc.edu
Fri Sep 5 04:55:54 CEST 2003

```
On Thursday 04 September 2003 19:52, Ted Harding wrote:
> Thanks, Deepayan!
> However, for me this has deepened the mystery (I don't really
> understand in detail how lattice graphics works anyway!).
>
> To clarify: The variables X,Y,Z,W in DF have some zero values,
> and otherwise are positive. For U,V in X,Y,Z,W I plot log(1+V)
> against log(1+U) for all the points. But I regress log(V) on
> log(U) using only those points where both U and V are positive
> (for these data the difference between log(U) and log(1+U) is
> small when U>0, and has little effect on the plot; but I want
> the regression to be as stated). Can this be incorporated into
> the framework you suggest below?

Sure. The only lattice/trellis 'style' issue relevant here is that the panel
function is supposed to work with the x and y vectors supplied to it (and is
not designed to easily access any extraneous information) --- other than
that, it's as flexible as R allows you to be.

In this case, things are slightly complicated by the fact that axis limits
cannot be controlled in splom except as determined by the range of the data
supplied. This means the data frame you supply has to be in the log(1+U)
form, and the regression framed accordingly. So,

splom(log(1 + DF),
      panel = function(x, y, ...) {
          panel.xyplot(x, y, ...)
          ok <- (x > 0) & (y > 0)
          fm <- lm(log(exp(y[ok]) - 1) ~ log(exp(x[ok]) - 1))
          panel.abline(fm, ...)
      })

Does this give you what you want?

If it makes more sense, I can send you a modified panel.pairs which would
allow something like

splom(DF,
      prepanel.limits = function(x) extend.limits(range(log(1 + x))),
      panel = function(x, y, ...) {
          panel.xyplot(log(1 + x), log(1 + y), ...)
          ok <- (x > 0) & (y > 0)
          fm <- lm(log(y[ok]) ~ log(x[ok]))
          panel.abline(fm, ...)
      })

Deepayan

>
> Thanks!
> Ted.
>
> On 04-Sep-03 Deepayan Sarkar wrote:
> > You can't do it in that sequence, and whether you can do it at all
> > depends on exactly what you mean when you say that the data used for
> > the regressions are not the same as those used for the plots. The
> > typical way would be to do
> >
> > splom(DF,
> >       panel = function(x, y, ...) {
> >           panel.xyplot(x, y, ...)
> >
> >           # modify x and y as appropriate (?)
> >           # whether that can be done depends on whether
> >           # you have all the information you need
> >           # available inside the panel function
> >
> >           fm <- lm(y ~ x)
> >           panel.abline(fm)
> >       })
> >
> > Can't think of anything else (other than using a custom superpanel
> > function).
> >
> > Deepayan
> >
> > On Thursday 04 September 2003 11:47 am, Ted Harding wrote:
> >> Sorry Folks,
> >> I'm sure I could suss out the answer myself but I need it
> >> soon ... !
> >>
> >> 1. Given a set of 4 variables X,Y,Z,W in a dataframe DF, I make
> >>    a scatter-plot matrix using splom(DF).
> >>
> >> 2. I do all regressions of U on V using lm(U~V), where U and V
> >>    are all 12 different ordered pairs from X,Y,Z,W.
> >>
> >> 3. Now I would like to superpose the regression lines from (2)
> >>    onto the corresponding panels from (1).
> >>
> >> (By the way, the data used for the regressions are not quite
> >>  the same as those used for the plots, since a few observations
> > >>  are omitted from the regressions but appear on the plots,
> >>  so (1) and (2) really are separate operations).
> >>
> >> With thanks,
> >> Ted.

```
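The first suggestion in the reply relies on the fact that log(1 + u) > 0 exactly when u > 0, and that u can be recovered inside the panel function as exp(x) - 1. The following is a minimal sketch of that idea on made-up data, not the data from the thread. The data frame contents and the helper name `fit_loglog` are assumptions introduced for illustration, and `log1p()`/`expm1()` are used in place of `log(1 + x)` and `exp(x) - 1` only for numerical accuracy.

```r
library(lattice)

# Hypothetical data standing in for X, Y, Z, W: non-negative, with a few zeros.
set.seed(1)
n <- 60
X <- rexp(n, rate = 0.1)
Y <- 2 * X^0.8 * exp(rnorm(n, sd = 0.3))
Z <- rexp(n, rate = 0.2)
W <- X + Z
X[1:3] <- 0
Z[4:5] <- 0
Y[6] <- 0
DF <- data.frame(X = X, Y = Y, Z = Z, W = W)

# Hypothetical helper: x and y arrive on the log(1 + .) scale.
# Regress log(v) on log(u), using only points where both u and v are positive.
fit_loglog <- function(x, y) {
    ok <- (x > 0) & (y > 0)   # log(1 + u) > 0 exactly when u > 0
    u <- expm1(x[ok])         # undo the transform: u = exp(x) - 1
    v <- expm1(y[ok])
    lm(log(v) ~ log(u))
}

# All points are plotted as log(1 + .); the line comes from the restricted fit.
p <- splom(log1p(DF),
           panel = function(x, y, ...) {
               panel.xyplot(x, y, ...)
               panel.abline(coef(fit_loglog(x, y)))
           })
# print(p) draws the plot.
```

Note that the fitted line is expressed in log(u), log(v) coordinates but drawn on log(1 + u), log(1 + v) axes, so it is only approximately aligned with the points; as noted in the thread, the difference is small when the positive values are not close to zero.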