[R] Why does debugging print() change output of function?

Sun Sep 7 07:37:53 CEST 2014

On Sep 6, 2014, at 4:24 PM, William Dunlap wrote:

> In your first example I get an error:
>> mtest.data.frame(testdata, valid2=="N", valid3 > 1)
>  Error in mtest.data.frame(testdata, valid2 == "N", valid3 > 1) :
>    object 'valid2' not found
> I expect the error because list(...) ought to evaluate the ... arguments.

Thank you (and JWDougherty) for looking at this. I see that the difference lies in the fact that I have vectors in my workspace that were used  as preliminaries in constructing my test case that are being accessed by my logical expressions.

 group1 <- paste("Group", rep(LETTERS[1:7], sep=''))
  group2 <- c("UNC", "UNC", "SS", "LS", "LS", "SS", "UNC")
  valid1 <- c("Y", "N", NA, "N", "Y", "Y", "N")
 valid2 <- c("N", "N", "Y", "N", "N", "Y", "N")
  valid3 <- c(1.4, 1.2, NA, 0.7, 0.3, NA, 1.7)
  valid4 <- c(0.4, 0.3, 0.53, 0.66, 0.3, 0.3, 0.71)
 valid5 <- c(8.5, 11.2,NA, NA, 8.3, NA, 11.7)

I should have executed rm(list=ls()) and repeated my testing before posting, but you 
> Use substitute() to get the unevaluated ... arguments up front and
> don't use substitute() in the loop over the elements of test.
> 
> There are several ways to get the unevaluated ... arguments.  E.g.,
>  f0 <- function(x, ..., drop=FALSE) match.call(expand.dots=FALSE)$...

>  f1 <- function(x, ..., drop=FALSE) substitute(...())
>  f2 <- function(x, ..., drop=FALSE) as.list(substitute(list(...)))[-1]

These three version are somewhat confusing,  the second one in particular makes it appear that the ellipsis is a function, while the other ones make it appear that they are an expression pointing to a list.

> 
> Your function could be the following, where I also fixed a problem
> with parent.frame() being
> called in the wrong scope and improved,

Yes, I was worried about that.

> IMO, the names on the output data.frame.
> 
> m2 <- function (x, ..., drop = FALSE, verbose = FALSE)
> {
>    tests <- substitute(...())
>    nms <- names(tests) # fix up names, since data.frame makes ugly ones
>    if (is.null(nms)) {
>        names(tests) <- paste0("T", seq_along(tests))
>    }
>    else if (any(nms == "")) {
>        names(tests)[nms == ""] <- paste0("T", which(nms == ""))
>    }
>    if (verbose) {
>        print(tests)
>    }
>    r <- if (length(tests) == 0) {
>        stop("no 'tests'")
>    }
>    else {
>        enclos <- parent.frame() # evaluate parent.frame() outside of FUN()
>        data.frame(lapply(tests, FUN=function(e) {
>            r <- eval(e, x, enclos)
>            if (!is.logical(r)) {
>                stop("'tests' must be logical")
>            }
>            r & !is.na(r)
>        }))
>    }
>    r
> }
> 
> used as:
>> m2(testdata, group2=="UNC", Eleven.Two=valid5=="11.2")
>     T1 Eleven.Two
> 1  TRUE      FALSE
> 2  TRUE       TRUE
> 3 FALSE      FALSE
> 4 FALSE      FALSE
> 5 FALSE      FALSE
> 6 FALSE      FALSE
> 7  TRUE      FALSE
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com

Thank you again, Bill. 

-- 
David.

> 
> 
> On Sat, Sep 6, 2014 at 3:31 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>> The goal:
>>   to create a function modeled after `subset` (notorious for its non-standard evaluation) that will take a series of logical tests as unqiuoted expressions to be evaluated in the framework of a dataframe environment and return a dataframe of logicals:
>> 
>> 
>> mtest.data.frame <-
>> function (x, ..., drop=FALSE)
>> { tests <- list(...); print(tests)
>>     r <- if (length(tests)==0)
>>         stop("no 'tests'")
>>     else { cbind.data.frame(
>>             lapply( tests, function(t){
>>                          e <- substitute(t)
>>                          r <- eval(e, x, parent.frame() )
>>                  if ( !is.logical(r) ) {
>>                      stop("'tests' must be logical") }
>>                     r & !is.na(r) } ) )
>>     }
>> }
>> #--------------
>> 
>> testdata <- structure(list(group1 = structure(1:7, .Label = c("Group A",
>> "Group B", "Group C", "Group D", "Group E", "Group F", "Group G"
>> ), class = "factor"), group2 = structure(c(3L, 3L, 2L, 1L, 1L,
>> 2L, 3L), .Label = c("LS", "SS", "UNC"), class = "factor"), valid1 = structure(c(2L,
>> 1L, NA, 1L, 2L, 2L, 1L), .Label = c("N", "Y"), class = "factor"),
>>    valid2 = structure(c(1L, 1L, 2L, 1L, 1L, 2L, 1L), .Label = c("N",
>>    "Y"), class = "factor"), valid3 = structure(c(4L, 3L, NA,
>>    2L, 1L, NA, 5L), .Label = c("0.3", "0.7", "1.2", "1.4", "1.7"
>>    ), class = "factor"), valid4 = structure(c(2L, 1L, 3L, 4L,
>>    1L, 1L, 5L), .Label = c("0.3", "0.4", "0.53", "0.66", "0.71"
>>    ), class = "factor"), valid5 = structure(c(4L, 1L, NA, NA,
>>    3L, NA, 2L), .Label = c("11.2", "11.7", "8.3", "8.5"), class = "factor")), .Names = c("group1",
>> "group2", "valid1", "valid2", "valid3", "valid4", "valid5"), row.names = c(NA,
>> -7L), class = "data.frame")
>> 
>> #######
>> 
>> 
>>> mtest.data.frame(testdata, valid2=="N", valid3 > 1)
>> [[1]]
>> [1] "tests are"
>> 
>> [[2]]
>> [1]  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE
>> 
>> [[3]]
>> [1]  TRUE  TRUE    NA FALSE FALSE    NA  TRUE
>> 
>> This actually seemed to be somewhat successful, but when ...
>> 
>> Now if I take out the `print()` call for 'tests', I get an different answer:
>> 
>>> mtest.data.frame <-
>> +  function (x, ..., drop=FALSE)
>> +  { tests <- list(...)
>> +      r <- if (length(tests)==0)
>> +          stop("no 'tests'")
>> +      else { cbind.data.frame(
>> +              lapply( tests, function(t){
>> +                           e <- substitute(t)
>> +                           r <- eval(e, x, parent.frame() )
>> +                   if ( !is.logical(r) ) {
>> +                       stop("'tests' must be logical") }
>> +                      r & !is.na(r) } ) )
>> +      }
>> +  }
>>> mtest.data.frame(testdata, valid2=="N", valid3 > 1)
>>>  # i.e. no answer
>> 
>> --

David Winsemius
Alameda, CA, USA