[R] Passing formula as parameter to `lm` within `sapply` causes error [BUG?]

Duncan Murdoch
Wed May 1 00:32:17 CEST 2019


On 30/04/2019 11:24 a.m., Jens Heumann wrote:
> Hi,
> 
> `lm` won't accept a formula passed as a parameter when it is called
> within `sapply`; see the example below. Could anyone please either point
> me to a syntax error or confirm that this might be a bug?
> 

I haven't looked carefully at your example.  From a quick glance,
however, I'd suspect that the issue is with the formula.  Formulas have
environments attached, and lm() looks there for any variables that
aren't in its data argument.  In your code it's not obvious to me what
environment would be attached, but I suspect it's the caller of sapply,
not the environment of the function call that sapply makes for a
particular value of its argument.  I think this because of a rule that
is supposed to be followed in R:

   Formulas get the environment where they were created attached to 
them.  That would be your global environment.
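A small sketch of that rule, using base R only.  The helper name
make_formula and the local variable s are made up for illustration and
are not part of the original example:

```r
# A formula created at top level records the environment it was created in.
FO <- y ~ x
identical(environment(FO), globalenv())   # TRUE when run at top level

# A formula created inside a function records that call's environment,
# so variables local to the call can be reached through it.
make_formula <- function() {
  s <- 1   # local variable, reachable only via this call's environment
  y ~ x
}
f <- make_formula()
identical(environment(f), globalenv())                  # FALSE
exists("s", envir = environment(f), inherits = FALSE)   # TRUE
```

The two formulas look the same when printed; only the attached
environment differs.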

R is flexible, so functions don't have to follow this rule, but it 
causes lots of confusion when they don't.
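If that guess is right, the following sketch should reproduce the error
and show two possible ways around it.  The data frame dat is made-up
data standing in for df1, and the column name .w is a hypothetical
helper; neither comes from the original post.  I have not checked this
against the exact example quoted below.

```r
# Hypothetical data standing in for df1 from the original post.
set.seed(1)
dat <- data.frame(x = rnorm(9), y = rnorm(9),
                  z = rep(1:3, each = 3),
                  w = c(0.7, 0.8, 1.2, 0.9, 1.3, 1.2, 0.8, 1, 1))
FO <- y ~ x            # created here, so it carries this environment
st <- "z"; ws <- "w"

# Fails: the subset expression is evaluated in `data` and then in
# environment(FO); the function-local `s` lives in neither.
res_bad <- try(
  sapply(unique(dat[[st]]), function(s)
    coef(lm(FO, dat, dat[[st]] == s, dat[[ws]]))),
  silent = TRUE)
inherits(res_bad, "try-error")

# Workaround 1: give a copy of the formula the function's own environment.
res_ok <- sapply(unique(dat[[st]]), function(s) {
  f <- FO
  environment(f) <- environment()
  coef(lm(f, dat, dat[[st]] == s, dat[[ws]]))
})

# Workaround 2: subset the data beforehand, so no lookup of `s` is needed.
res_ok2 <- sapply(unique(dat[[st]]), function(s) {
  d <- dat[dat[[st]] == s, ]
  d$.w <- d[[ws]]      # weights stored as a column (hypothetical name)
  coef(lm(FO, d, weights = .w))
})
all.equal(res_ok, res_ok2)
```

The first workaround keeps the original lm() call unchanged; the second
avoids depending on the formula's environment at all.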

Duncan Murdoch



> Best,
> Jens
> 
> [Disclaimer: This is my first post here, following the advice on how to
> proceed with possible bugs from here: https://www.r-project.org/bugs.html]
> 
> 
> SUMMARY
> 
> While `lm` on its own accepts the formula parameter `FO`, the same call
> within `sapply` causes an error. When everything except the formula
> `FO` is passed as a parameter, it still works. All parameters work fine
> within a similar `for` loop.
> 
> 
> MCVE (see data / R-version at bottom)
> 
>   > summary(lm(y ~ x, df1, df1[["z"]] == 1, df1[["w"]]))$coef[1, ]
>     Estimate Std. Error    t value   Pr(>|t|)
>    1.6269038  0.9042738  1.7991275  0.3229600
>   > summary(lm(FO, data, data[[st]] == st1, data[[ws]]))$coef[1, ]
>     Estimate Std. Error    t value   Pr(>|t|)
>    1.6269038  0.9042738  1.7991275  0.3229600
>   > sapply(unique(df1$z), function(s)
> +   summary(lm(y ~ x, df1, df1[["z"]] == s, df1[[ws]]))$coef[1, ])
>                   [,1]       [,2]         [,3]
> Estimate   1.6269038 -0.1404174 -0.010338774
> Std. Error 0.9042738  0.4577001  1.858138516
> t value    1.7991275 -0.3067890 -0.005564049
> Pr(>|t|)   0.3229600  0.8104951  0.996457853
>   > sapply(unique(data[[st]]), function(s)
> +   summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ])  # !!!
> Error in eval(substitute(subset), data, env) : object 's' not found
>   > sapply(unique(data[[st]]), function(s)
> +   summary(lm(y ~ x, data, data[[st]] == s, data[[ws]]))$coef[1, ])
>                   [,1]       [,2]         [,3]
> Estimate   1.6269038 -0.1404174 -0.010338774
> Std. Error 0.9042738  0.4577001  1.858138516
> t value    1.7991275 -0.3067890 -0.005564049
> Pr(>|t|)   0.3229600  0.8104951  0.996457853
>   > m <- matrix(NA, 4, length(unique(data[[st]])))
>   > for (s in unique(data[[st]])) {
> +   m[, s] <- summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ]
> + }
>   > m
>             [,1]       [,2]         [,3]
> [1,] 1.6269038 -0.1404174 -0.010338774
> [2,] 0.9042738  0.4577001  1.858138516
> [3,] 1.7991275 -0.3067890 -0.005564049
> [4,] 0.3229600  0.8104951  0.996457853
> 
> # DATA #################################################################
> 
> df1 <- structure(list(x = c(1.37095844714667, -0.564698171396089,
> 0.363128411337339,
> 0.63286260496104, 0.404268323140999, -0.106124516091484, 1.51152199743894,
> -0.0946590384130976, 2.01842371387704), y = c(1.30824434809425,
> 0.740171482827397, 2.64977380403845, -0.755998096151299, 0.125479556323628,
> -0.239445852485142, 2.14747239550901, -0.37891195982917, -0.638031707027734
> ), z = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), w = c(0.7, 0.8,
> 1.2, 0.9, 1.3, 1.2, 0.8, 1, 1)), class = "data.frame", row.names = c(NA,
> -9L))
> 
> FO <- y ~ x; data <- df1; st <- "z"; ws <- "w"; st1 <- 1
> 
> ########################################################################
> 
>   > R.version
>                  _
> platform       x86_64-w64-mingw32
> arch           x86_64
> os             mingw32
> system         x86_64, mingw32
> status
> major          3
> minor          6.0
> year           2019
> month          04
> day            26
> svn rev        76424
> language       R
> version.string R version 3.6.0 (2019-04-26)
> nickname       Planting of a Tree
> 
> #########################################################################
> 
> NOTE: Question on SO two days ago
> (https://stackoverflow.com/questions/55893189/passing-formula-as-parameter-to-lm-within-sapply-causes-error-bug-confirmation)
> brought many views but neither answer nor bug confirmation.
> 