[R] Passing formula as parameter to `lm` within `sapply` causes error [BUG?]

peter dalgaard pdalgd at gmail.com
Wed May 1 10:47:10 CEST 2019


Or, following up on the hint by Duncan, this works too:

> sapply(unique(data[[st]]), function(s){
+   environment(FO) <- environment()
+   summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ]
+   })  # !!!
                [,1]       [,2]         [,3]
Estimate   1.6269038 -0.1404174 -0.010338774
Std. Error 0.9042738  0.4577001  1.858138516
t value    1.7991275 -0.3067890 -0.005564049
Pr(>|t|)   0.3229600  0.8104951  0.996457853
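
For what it's worth, here is a minimal sketch of why this works. It uses made-up data: the `toy` data frame and its values are illustrative assumptions, not the `df1` from the thread. `lm` evaluates the `subset` and `weights` arguments first in `data` and then in the environment of the formula, which is what the `eval(substitute(subset), data, env)` in the error message points at. The assignment inside the function changes only a local copy of `FO`, so the outer formula should be left as it was.

```r
# Toy data in the spirit of the thread's df1 (values are made up).
set.seed(1)
toy <- data.frame(x = rnorm(9), y = rnorm(9),
                  z = rep(1:3, each = 3), w = runif(9, 0.5, 1.5))
FO <- y ~ x; st <- "z"; ws <- "w"
env_before <- environment(FO)        # environment where FO was created

res <- sapply(unique(toy[[st]]), function(s) {
  environment(FO) <- environment()   # modifies a local copy of FO only
  coef(lm(FO, toy, toy[[st]] == s, toy[[ws]]))
})

# The outer FO still carries its original environment.
identical(environment(FO), env_before)
```

So the reassignment has no side effect on the formula outside the anonymous function.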

or even:

> sapply(unique(data[[st]]), function(s){
+   environment(FO) <- environment()
+   summary(lm(FO, data, z == s, w))$coef[1, ]
+   })  # !!!
                [,1]       [,2]         [,3]
Estimate   1.6269038 -0.1404174 -0.010338774
Std. Error 0.9042738  0.4577001  1.858138516
t value    1.7991275 -0.3067890 -0.005564049
Pr(>|t|)   0.3229600  0.8104951  0.996457853
>
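
The second form works because the lookup goes through the columns of `data` before it reaches the formula's environment: `z` and `w` are found as columns, and `s` is found in the function's frame once that frame is the formula's environment. A rough sketch of that lookup order using plain `eval` follows. The names `toy`, `pick` and `s_local` are hypothetical, chosen only for this illustration.

```r
toy <- data.frame(z = c(1L, 1L, 2L), w = c(0.7, 0.8, 1.2))
FO <- y ~ x                          # created outside the function

pick <- function(s_local) {
  # Enclosure = this function's frame: z comes from toy, s_local from here.
  ok <- eval(quote(z == s_local), toy, environment())
  # Enclosure = environment(FO): s_local is not visible from there.
  bad <- tryCatch(eval(quote(z == s_local), toy, environment(FO)),
                  error = function(e) conditionMessage(e))
  list(ok = ok, bad = bad)
}

out <- pick(2L)
out$ok    # FALSE FALSE TRUE
out$bad   # an error message saying that s_local was not found
```

This mirrors the original failure: with `FO` created outside the function, `s` was looked up from the wrong place.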

> On 1 May 2019, at 07:14 , Jens Heumann <jens.heumann using students.unibe.ch> wrote:
> 
> Thanks a lot for your hint, David. It finally worked doing:
> 
> > sapply(unique(data[[st]]), function(s)
> +   summary(do.call("lm", list(FO, data, data[[st]] == s,
> +                              data[[ws]])))$coef[1, ])
>                [,1]       [,2]         [,3]
> Estimate   1.6269038 -0.1404174 -0.010338774
> Std. Error 0.9042738  0.4577001  1.858138516
> t value    1.7991275 -0.3067890 -0.005564049
> Pr(>|t|)   0.3229600  0.8104951  0.996457853
> 
> Best,
> Jens
> 
> On 30.04.2019 23:03, David Winsemius wrote:
>> Try using do.call
>> David
>> Sent from my iPhone
>>> On Apr 30, 2019, at 9:24 AM, Jens Heumann <jens.heumann using students.unibe.ch> wrote:
>>> 
>>> Hi,
>>> 
>>> `lm` won't take a formula as a parameter when it is called within `sapply`; see the example below. Could anyone please either point me to a syntax error or confirm that this might be a bug?
>>> 
>>> Best,
>>> Jens
>>> 
>>> [Disclaimer: This is my first post here, following advice of how to proceed with possible bugs from here: https://www.r-project.org/bugs.html]
>>> 
>>> 
>>> SUMMARY
>>> 
>>> While `lm` on its own accepts the formula parameter `FO`, the same call within `sapply` causes an error. It still works, though, when everything except the formula is passed as a parameter (i.e. the formula is written out as `y ~ x`). All parameters work fine within a similar `for` loop.
>>> 
>>> 
>>> MCVE (see data / R-version at bottom)
>>> 
>>> > summary(lm(y ~ x, df1, df1[["z"]] == 1, df1[["w"]]))$coef[1, ]
>>>  Estimate Std. Error    t value   Pr(>|t|)
>>> 1.6269038  0.9042738  1.7991275  0.3229600
>>> > summary(lm(FO, data, data[[st]] == st1, data[[ws]]))$coef[1, ]
>>>  Estimate Std. Error    t value   Pr(>|t|)
>>> 1.6269038  0.9042738  1.7991275  0.3229600
>>> > sapply(unique(df1$z), function(s)
>>> +   summary(lm(y ~ x, df1, df1[["z"]] == s, df1[[ws]]))$coef[1, ])
>>>                [,1]       [,2]         [,3]
>>> Estimate   1.6269038 -0.1404174 -0.010338774
>>> Std. Error 0.9042738  0.4577001  1.858138516
>>> t value    1.7991275 -0.3067890 -0.005564049
>>> Pr(>|t|)   0.3229600  0.8104951  0.996457853
>>> > sapply(unique(data[[st]]), function(s)
>>> +   summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ])  # !!!
>>> Error in eval(substitute(subset), data, env) : object 's' not found
>>> > sapply(unique(data[[st]]), function(s)
>>> +   summary(lm(y ~ x, data, data[[st]] == s, data[[ws]]))$coef[1, ])
>>>                [,1]       [,2]         [,3]
>>> Estimate   1.6269038 -0.1404174 -0.010338774
>>> Std. Error 0.9042738  0.4577001  1.858138516
>>> t value    1.7991275 -0.3067890 -0.005564049
>>> Pr(>|t|)   0.3229600  0.8104951  0.996457853
>>> > m <- matrix(NA, 4, length(unique(data[[st]])))
>>> > for (s in unique(data[[st]])) {
>>> +   m[, s] <- summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ]
>>> + }
>>> > m
>>>          [,1]       [,2]         [,3]
>>> [1,] 1.6269038 -0.1404174 -0.010338774
>>> [2,] 0.9042738  0.4577001  1.858138516
>>> [3,] 1.7991275 -0.3067890 -0.005564049
>>> [4,] 0.3229600  0.8104951  0.996457853
>>> 
>>> # DATA #################################################################
>>> 
>>> df1 <- structure(list(x = c(1.37095844714667, -0.564698171396089, 0.363128411337339,
>>> 0.63286260496104, 0.404268323140999, -0.106124516091484, 1.51152199743894,
>>> -0.0946590384130976, 2.01842371387704), y = c(1.30824434809425,
>>> 0.740171482827397, 2.64977380403845, -0.755998096151299, 0.125479556323628,
>>> -0.239445852485142, 2.14747239550901, -0.37891195982917, -0.638031707027734
>>> ), z = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), w = c(0.7, 0.8,
>>> 1.2, 0.9, 1.3, 1.2, 0.8, 1, 1)), class = "data.frame", row.names = c(NA,
>>> -9L))
>>> 
>>> FO <- y ~ x; data <- df1; st <- "z"; ws <- "w"; st1 <- 1
>>> 
>>> ########################################################################
>>> 
>>> > R.version
>>>               _
>>> platform       x86_64-w64-mingw32
>>> arch           x86_64
>>> os             mingw32
>>> system         x86_64, mingw32
>>> status
>>> major          3
>>> minor          6.0
>>> year           2019
>>> month          04
>>> day            26
>>> svn rev        76424
>>> language       R
>>> version.string R version 3.6.0 (2019-04-26)
>>> nickname       Planting of a Tree
>>> 
>>> #########################################################################
>>> 
>>> NOTE: A question posted on SO two days ago (https://stackoverflow.com/questions/55893189/passing-formula-as-parameter-to-lm-within-sapply-causes-error-bug-confirmation) brought many views but neither an answer nor a bug confirmation.
>>> 
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com


