[Rd] backquotes and term.labels

Ben Bolker bbolker at gmail.com
Wed Mar 7 14:39:08 CET 2018


I knew I had seen this before but couldn't remember where until now:
https://github.com/lme4/lme4/issues/441. I initially fixed it with
gsub(), but (pushed by Martin Maechler to do better) I eventually
fixed it by storing the original names of the model frame (without
backticks) as an attribute for later retrieval:
https://github.com/lme4/lme4/commit/56416fc8b3b5153df7df5547082835c5d5725e89
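
For readers who want the idea without reading the commit, here is a minimal sketch of the "store the names as an attribute" approach. It is not the lme4 code. The attribute name "orig.names" is hypothetical, the data are borrowed from Bill Dunlap's example quoted below, and it assumes a model with a response and no interactions, so that each term maps to exactly one model-frame column.

```r
# Data with non-syntactic column names (from Bill Dunlap's example)
d <- data.frame(check.names = FALSE, y = 1/(1:5),
                `b$a$d` = sin(1:5) + 2, `x y z` = cos(1:5) + 2)
m <- model.frame(y ~ log(`b$a$d`) + `x y z`, data = d)
Terms <- attr(m, "terms")

# Record the model-frame column names of the right-hand-side variables.
# "orig.names" is a hypothetical attribute name chosen for this sketch.
resp <- attr(Terms, "response")   # position of the response column
attr(Terms, "orig.names") <- names(m)[-resp]

# Later code subscripts with the stored names rather than with
# attr(Terms, "term.labels"), so no backquote stripping is needed.
rhs <- m[attr(Terms, "orig.names")]
```

Because the stored names are taken from the model frame itself, nothing has to guess which backquotes in a label such as log(`b$a$d`) belong to the column name and which were added by deparse().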


On Wed, Mar 7, 2018 at 8:22 AM, Therneau, Terry M., Ph.D. via R-devel
<r-devel at r-project.org> wrote:
> Thanks to Bill Dunlap for the clarification.  On follow-up it turns out that
> this will be an issue for many if not most of the routines in the survival
> package: a lot of them look at the terms structure and make use of the
> dimnames of attr(terms, 'factors'), which also keeps the unneeded
> backquotes.  Others use the term.labels attribute.  To dodge this I will
> need to create a fixterms() routine which I call at the top of every single
> routine in the library.
>
> Is there a chance for a fix at a higher level?
>
> Terry T.
>
>
>
> On 03/05/2018 03:55 PM, William Dunlap wrote:
>>
>> I believe this has to do with terms() making "term.labels" (hence the
>> dimnames of "factors") with deparse(), so that the backquotes are included
>> for non-syntactic names.  The backquotes are not in the column names of the
>> input data.frame (nor the model frame), so you get a mismatch when
>> subscripting the data.frame or model.frame with elements of
>> terms()$term.labels.
>>
>> I think you can avoid the problem by adding right after
>>      ll <- attr(Terms, "term.labels")
>> the line
>>      ll <- gsub("^`|`$", "", ll)
>>
>> E.g.,
>>
>>  > d <- data.frame(check.names=FALSE, y=1/(1:5), `b$a$d`=sin(1:5)+2, `x y z`=cos(1:5)+2)
>>  > Terms <- terms( y ~ log(`b$a$d`) + `x y z` )
>>  > m <- model.frame(Terms, data=d)
>>  > colnames(m)
>> [1] "y"            "log(`b$a$d`)" "x y z"
>>  > attr(Terms, "term.labels")
>> [1] "log(`b$a$d`)" "`x y z`"
>>  >   ll <- attr(Terms, "term.labels")
>>  > gsub("^`|`$", "", ll)
>> [1] "log(`b$a$d`)" "x y z"
>>
>> It is a bit of a mess.
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Mon, Mar 5, 2018 at 12:55 PM, Therneau, Terry M., Ph.D. via R-devel
>> <r-devel at r-project.org> wrote:
>>
>>     A user reported a problem with the survdiff function and the use of
>>     variables that contain a space.  Here is a simple example.  The same
>>     issue occurs in survfit for the same reason.
>>
>>     lung2 <- lung
>>     names(lung2)[1] <- "in st"   # old name is inst
>>     survdiff(Surv(time, status) ~ `in st`, data=lung2)
>>     Error in `[.data.frame`(m, ll) : undefined columns selected
>>
>>     In the body of the code the program wants to send all of the
>>     right-hand side variables forward to the strata() function.  The code
>>     looks more or less like this, where m is the model frame:
>>
>>        Terms <- terms(m)
>>        index <- attr(Terms, "term.labels")
>>        if (length(index) == 0) {
>>            X <- rep(1L, n)  # no covariates
>>        } else {
>>            X <- strata(m[index])
>>        }
>>
>>     For the variable with a space in the name the term.label is "`in st`",
>>     and the subscript fails.
>>
>>     Is this intended behaviour or a bug?  The issue is that the name of
>>     this column in the model frame does not have the backticks, while the
>>     terms structure does have them.
>>
>>     Terry T.
>>
>>     ______________________________________________
>>     R-devel at r-project.org mailing list
>>     https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>