[Rd] Fwd: Re: [EXTERNAL] Re: backquotes and term.labels

Ben Bolker bbolker at gmail.com
Thu Mar 8 15:42:40 CET 2018


Meant to respond to this but forgot.

I didn't write a new terms() function; I added an attribute to the terms
object (a vector of the names of the constructed model matrix), thus
preserving the information at the point when it was available.
I do agree that it would be preferable to have an upstream fix ...
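
A minimal sketch of the attribute approach, reusing the data from Bill's
example quoted below. It assumes the model-frame column names are the ones
worth saving, since the quoted session shows they carry no outer backticks.
The attribute name "orig.names" is hypothetical, chosen for illustration;
it is not the name lme4 uses.

```r
# Data frame with non-syntactic column names (from the quoted example)
d <- data.frame(check.names = FALSE, y = 1/(1:5),
                `b$a$d` = sin(1:5) + 2, `x y z` = cos(1:5) + 2)
Terms <- terms(y ~ log(`b$a$d`) + `x y z`)
m <- model.frame(Terms, data = d)

# Save the right-hand-side column names while the model frame is at hand.
# "orig.names" is a hypothetical attribute name; the response column is
# dropped using the "response" attribute that terms() sets.
attr(Terms, "orig.names") <- names(m)[-attr(Terms, "response")]

# Later code subscripts with the saved names instead of "term.labels"
index <- attr(Terms, "orig.names")
rhs <- m[index]
```

The saved names match the model frame exactly, so the subscript works
without stripping backticks from "term.labels" after the fact.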


On Thu, Mar 8, 2018 at 9:39 AM, Therneau, Terry M., Ph.D. via R-devel
<r-devel at r-project.org> wrote:
> Ben,
>
>
> Looking at your notes, it appears that your solution is to write your own
> terms() function for lme.  It is easy to verify that the "varnames.fixed"
> attribute is not returned by the usual terms function.
>
> Then I also need to write my own terms function for the survival and coxme
> packages?  Because of the need to treat strata() terms in a special way I
> manipulate the formula/terms in nearly every routine.
>
> Extrapolating: every R package that tries to examine formulas and partition
> them into bits needs its own terms function?  This does not look like a good
> solution to me.
>
> On 03/07/2018 07:39 AM, Ben Bolker wrote:
>>
>> I knew I had seen this before but couldn't previously remember where.
>> https://github.com/lme4/lme4/issues/441 ... I initially fixed it with
>> gsub(), but (pushed by Martin Maechler to do better) I eventually
>> fixed it by storing the original names of the model frame (without
>> backticks) as an attribute for later retrieval:
>>
>> https://github.com/lme4/lme4/commit/56416fc8b3b5153df7df5547082835c5d5725e89.
>>
>>
>> On Wed, Mar 7, 2018 at 8:22 AM, Therneau, Terry M., Ph.D. via R-devel
>> <r-devel at r-project.org> wrote:
>>>
>>> Thanks to Bill Dunlap for the clarification.  On follow-up it turns out
>>> that this will be an issue for many if not most of the routines in the
>>> survival package: a lot of them look at the terms structure and make use
>>> of the dimnames of attr(terms, 'factors'), which also keeps the unneeded
>>> backquotes.  Others use the term.labels attribute.  To dodge this I will
>>> need to create a fixterms() routine which I call at the top of every
>>> single routine in the library.
>>>
>>> Is there a chance for a fix at a higher level?
>>>
>>> Terry T.
>>>
>>>
>>>
>>> On 03/05/2018 03:55 PM, William Dunlap wrote:
>>>>
>>>> I believe this has to do with terms() making "term.labels" (hence the
>>>> dimnames of "factors") with deparse(), so that the backquotes are
>>>> included for non-syntactic names.  The backquotes are not in the column
>>>> names of the input data.frame (nor model frame) so you get a mismatch
>>>> when subscripting the data.frame or model.frame with elements of
>>>> terms()$term.labels.
>>>>
>>>> I think you can avoid the problem by adding right after
>>>>       ll <- attr(Terms, "term.labels")
>>>> the line
>>>>       ll <- gsub("^`|`$", "", ll)
>>>>
>>>> E.g.,
>>>>
>>>>   > d <- data.frame(check.names=FALSE, y=1/(1:5), `b$a$d`=sin(1:5)+2,
>>>>   +                 `x y z`=cos(1:5)+2)
>>>>   > Terms <- terms( y ~ log(`b$a$d`) + `x y z` )
>>>>   > m <- model.frame(Terms, data=d)
>>>>   > colnames(m)
>>>> [1] "y"            "log(`b$a$d`)" "x y z"
>>>>   > attr(Terms, "term.labels")
>>>> [1] "log(`b$a$d`)" "`x y z`"
>>>>   >   ll <- attr(Terms, "term.labels")
>>>>   > gsub("^`|`$", "", ll)
>>>> [1] "log(`b$a$d`)" "x y z"
>>>>
>>>> It is a bit of a mess.
>>>>
>>>>
>>>> Bill Dunlap
>>>> TIBCO Software
>>>> wdunlap tibco.com
>>>>
>>>> On Mon, Mar 5, 2018 at 12:55 PM, Therneau, Terry M., Ph.D. via R-devel
>>>> <r-devel at r-project.org> wrote:
>>>>
>>>>      A user reported a problem with the survdiff function and the use of
>>>>      variables that contain a space.  Here is a simple example.  The same
>>>>      issue occurs in survfit for the same reason.
>>>>
>>>>      lung2 <- lung
>>>>      names(lung2)[1] <- "in st"   # old name is inst
>>>>      survdiff(Surv(time, status) ~ `in st`, data=lung2)
>>>>      Error in `[.data.frame`(m, ll) : undefined columns selected
>>>>
>>>>      In the body of the code the program wants to send all of the
>>>>      right-hand side variables forward to the strata() function.  The code
>>>>      looks more or less like this, where m is the model frame:
>>>>
>>>>         Terms <- terms(m)
>>>>         index <- attr(Terms, "term.labels")
>>>>         if (length(index) == 0)  X <- rep(1L, n)  # no covariates
>>>>         else X <- strata(m[index])
>>>>
>>>>      For the variable with a space in the name the term.label is
>>>>      "`in st`", and the subscript fails.
>>>>
>>>>      Is this intended behaviour or a bug?  The issue is that the name of
>>>>      this column in the model frame does not have the backticks, while the
>>>>      terms structure does have them.
>>>>
>>>>      Terry T.
>>>>
>>>>      ______________________________________________
>>>>      R-devel at r-project.org mailing list
>>>>      https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>>


