[Rd] improved pairs.formula?

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Mar 29 09:17:31 CEST 2005


On Tue, 29 Mar 2005, Berwin A Turlach wrote:

> Dear all,
>
> I would like to suggest changing the pairs.formula command such that a
> command like
>
>        pairs(GNP ~ . - Year - GNP.deflator, longley)
>
> would behave similarly to
>
>        lm(GNP ~ . - Year - GNP.deflator, longley)
>
> i.e., make a pairwise scatterplot of GNP and all other variables in
> the (longley) dataframe except for Year and GNP.deflator.  The above
> command, with the current version of pairs.formula, produces a
> pairwise scatterplot of all variables in the (longley) dataframe.
>
> After some tinkering, I came up with the following replacement for
> pairs.formula, which seems to do the job:
>
>    pairs.formula <-
>    function (formula, data = NULL, ..., subset, na.action = stats::na.pass)
>    {
>      m <- match.call(expand.dots = FALSE)
>      if (is.matrix(eval(m$data, parent.frame())))
>        m$data <- as.data.frame(data)
>      m$... <- NULL
>      m$na.action <- stats::na.pass
>      m[[1]] <- as.name("model.frame")
>      mf <- eval(m, parent.frame())
>
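>      # Keep only variables that appear in at least one model term.
>      # Rows of the "factors" matrix that are all zero belong to the
>      # response or to variables removed with '-'.
>      mt <- attr(mf, "terms")
>      tmp <- attr(mt, "factors")
>      ind <- apply(tmp, 1, max)
>      sv <- rownames(tmp)[ind>0]
>      # Map the retained names back to column positions in the model frame.
>      ind <- match(sv,
>                   sapply(attr(mt, "variables"), deparse, width.cutoff=500)[-1])
>      # Put the response, if there is one, first.
>      if( (tt <- attr(mt, "response")) != 0 ){
>        ind <- c(tt, ind)
>      }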
>
>      pairs(mf[,ind], ...)
>    }
>
> Would you please kindly consider replacing the current pairs.formula
> function (at the top of the file src/library/graphics/R/pairs.R) with
> the above function?

Perhaps you could explain what precisely 'the job' is and why you chose 
such an unusual piece of code to do it?  (E.g. what is the prescription 
for the ordering of terms, and why do you think the rownames of the 
factors and the variables might be in different orders?  They are set the 
same in the C code.)
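
The ordering question can be checked directly. What follows is only a sketch, assuming the built-in longley data and the formula from the original post; the names fac, vars and kept are introduced here for illustration and are not part of pairs.formula.

```r
# Build the model frame much as the proposed function would.
mf <- model.frame(GNP ~ . - Year - GNP.deflator, data = longley,
                  na.action = na.pass)
mt <- attr(mf, "terms")

# Row names of the "factors" matrix and the deparsed "variables" list.
fac  <- attr(mt, "factors")
vars <- sapply(attr(mt, "variables"), deparse, width.cutoff = 500)[-1]

# Compare the two orderings.
print(identical(rownames(fac), vars))

# Variables that take part in at least one term (the response is excluded).
kept <- rownames(fac)[apply(fac, 1, max) > 0]
print(kept)
```

If the comparison prints TRUE, the match() step in the proposed function is an identity mapping, and indexing by which(ind > 0) would presumably select the same columns.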

BTW, the help page specifically warns against a formula of the type you 
specified.  Why do you want to allow a response?  Currently only '+' is 
documented to work.
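
For comparison, a similar plot can be had with a one-sided formula that uses only '+'. This is a sketch rather than a recommendation: it assumes the longley data, and the names drop_vars, keep and f are hypothetical.

```r
# Columns to leave out of the scatterplot matrix.
drop_vars <- c("Year", "GNP.deflator")

# Remaining columns, in data-frame order.
keep <- setdiff(names(longley), drop_vars)

# Build a one-sided formula of the form ~ a + b + c from the names.
f <- reformulate(keep)
print(f)

# Draw the scatterplot matrix using the formula method.
pairs(f, data = longley)
```

Here GNP appears as an ordinary variable on the right-hand side rather than as a response.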

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


