[Rd] delete.response leaves response in attribute dataClasses

Paul Johnson pauljohn32 at gmail.com
Fri Jan 6 20:17:01 CET 2012


Thanks, Bill

Counter-arguments at the end

On Thu, Jan 5, 2012 at 3:15 PM, William Dunlap <wdunlap at tibco.com> wrote:
> My feeling that everyone would index dataClasses by name was
> wrong.  I looked through the packages that used dataClasses
> and saw code that would break if the first (response) entry
> were omitted.  (I didn't check to see if passing the output
> of delete.response to these functions would be appropriate.)
> E.g.,
> file: AICcmodavg/R/predictSE.mer.r
>  ##matrix with info on factors
>  fact.frame <- attr(attr(orig.frame, "terms"), "dataClasses")[-1]
>
>  ##continue if factors
>  if(any(fact.frame == "factor")) {
>    id.factors <- which(fact.frame == "factor")
>    fact.name <- names(fact.frame)[id.factors] #identify the rows for factors
>
> Some packages create a dataClass attribute for a model.frame
> (not its terms attribute) that does not have any names:
> file: caper/R/macrocaic.R
>   attr(mf, "dataClasses") <- rep("numeric", dim(termFactors)[2])
> .checkMFClasses() does not throw an error for that, but it
> doesn't do any real checking either.
>
> Most users of dataClasses do pass it to .checkMFClasses() to
> compare it with newdata and that doesn't care if you have extra
> entries in dataClasses.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>

I can't understand what your point is.  I agree we can work around the
problem, but why should we have to?

If you confine yourself to the output of "delete.response" applied to
a terms object from a regression, can you point to any package or
usage that depends on leaving the response variable in the dataClasses
attribute?  I can't find one.  In base R, these are all the references
to delete.response:

stats/R/models.R:delete.response <- function (termobj)
stats/R/lm.R:        Terms <- delete.response(tt)
stats/R/lm.R:        Terms <- delete.response(tt)
stats/R/ppr.R:        Terms <- delete.response(object$terms)
stats/R/loess.R:    as.matrix(model.frame(delete.response(terms(object)), newdata,
stats/R/dummy.coef.R:    Terms <- delete.response(Terms)

I've looked it over carefully and predict.lm (in lm.R) would not be
affected by the change I propose. I can't find any usage in loess.R of
the dataClasses attribute.

Furthermore, I can't see how a person would use the dataClasses
attribute at all, after the other markers of the response are
eliminated. How is a method supposed to find which variable is the
response once the "response" attribute has been set to 0?

I'm not disagreeing with you that I can work around the peculiarity
that the response is left in the dataClasses attribute of the output
object from delete.response.  I'm just saying it is a complication
that programmers should not have to put up with, because I think
delete.response should delete the response from all attributes of a
terms object.
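
To make the complaint concrete, here is a minimal sketch of the kind of
workaround I mean. The helper name drop_response_class is hypothetical
(it is not part of R), and the mtcars model is only an illustration. The
sketch looks up the response by name before calling delete.response, then
removes that name from dataClasses if it is still there, so it should
behave the same whether or not delete.response itself touches that
attribute.

```r
# Illustrative model; terms(fit) carries a "dataClasses" attribute.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
tt <- terms(fit)

# Inspect what delete.response leaves behind in dataClasses.
print(names(attr(delete.response(tt), "dataClasses")))

# Hypothetical helper (an assumption, not an R function): delete the
# response and also drop its entry from dataClasses, matching by name.
drop_response_class <- function(termobj) {
  resp <- attr(termobj, "response")
  out <- delete.response(termobj)
  if (!is.null(resp) && resp > 0L) {
    # The response name must be read from the ORIGINAL terms object,
    # because the result has response = 0 and no longer identifies it.
    resp_name <- deparse(attr(termobj, "variables")[[resp + 1L]])
    dc <- attr(out, "dataClasses")
    if (!is.null(dc) && !is.null(names(dc))) {
      attr(out, "dataClasses") <- dc[names(dc) != resp_name]
    }
  }
  out
}

tt_clean <- drop_response_class(tt)
print(names(attr(tt_clean, "dataClasses")))
```

Note that the helper has to be handed the original terms object. Given
only the output of delete.response, there is nothing left that says
which entry of dataClasses was the response.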

pj


-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


